PREFACE TO SECOND EDITION

This book arose out of an attempt, published in 1903, to use
Professor Pearson’s system of frequency curves for the graduation
of Mortality Tables. The subject was then unfamiliar to
actuaries, and the Institute of Actuaries encouraged me to
write a book on the subject, arranged for its publication and
relieved me of any expense in connection with it. My gratitude
is not only for the broad-mindedness with which a professional
body approached recent research, but also for the help and
encouragement given to a young, untried and inexperienced
member of the profession. Nor does it end there, for when the
original edition was nearly exhausted the Institute generously
handed over the copyright and left me with a free hand as
regards the future.

In dealing with frequency curves and curve fitting, new
matter has been brought in and the order of treatment of the
curves has been altered so that main types are less likely to be
confused with the transition and minor types. A chapter is
devoted to a comparison of the Pearson curves with the series
suggested for use by Edgeworth in this country and by many
continental writers. The chapters on correlation, contingency,
etc. have been largely rewritten and a new chapter on partial
correlation added.

The book as it now stands assumes that the reader is familiar
with the Primer of Statistics or some other very elementary
book. It demands no mathematical knowledge beyond that
required for the first examination of the Institute of Actuaries
or the Intermediate Examination for the B.Sc. of London
University. The subject is, however, statistical and arithmetical,
and examples must be worked out if the methods and
principles are to be mastered. The reader who goes through a
book on a practical subject and does not work out examples
is as certain to encounter imaginary and miss real difficulties
as he is to fail to obtain any satisfactory knowledge of the
subject.

Even if a reader does not possess the mathematical equipment
indicated, he can use frequency curves and correlation
reasonably without it, for the fact that a curve he has found
agrees with the statistics from which the moments were
obtained is a proof that, in the particular case, he has obtained
proper values for the constants, even though he has not followed
the mathematical reasoning leading to the equations. It must
not be inferred that belief without proof is advisable, but that
it is unwise for a practical man to put aside a practical subject
which he can test practically merely because he cannot follow
some of the proofs. There is another class of statistical students
whose wants may be mentioned: I refer to those who have
little need to study graduation and curve fitting in detail, but
require a knowledge of correlation, probable errors, etc. For
the sake of these readers an abridged reading is suggested in
the Appendix.

Frequency curves, correlation and sampling form a subject
in which there is still a great deal to be done, notwithstanding
the progress that has been made in recent years. Much of this
work has been highly mathematical, especially when it deals
with certain small samples or with attempts to find mathematical
expressions for skew correlation surfaces. These aspects
lie outside such a book as this, and, even if we neglect them,
we may still say that there are few subjects that offer a richer
field for original work. In this field the reader will find that
during the past thirty years we are indebted to Professor Karl
Pearson and his school for much of the work that has proved
a success in practice, and anyone writing on the subject for
practical men is bound to follow in his footsteps. Only those
who become interested in the subject and study Professor
Pearson’s original work will appreciate the great extent of his
contribution to statistical science.

I hope that the numerical examples in this book, and similar
arithmetical work done elsewhere, may tend to show that
actuarial statistics can be examined in the same way as the
statistics of biology, anthropology or sociology. May not such
work add some links to the chain of continuity and indicate a
wider law than an actuary studying his own subject exclusively
might be led to suspect?

As will be readily appreciated, I am chiefly indebted to
Professor Pearson, but the indebtedness is of a kind for which
it is impossible to offer formal thanks; such thanks would, at
their best, fail to express the sense of gratitude which prompted
them.

The revision of the book has been a reminder of much kind
help received in connection with the first edition from Mr G. J.
Lidstone and Mr John Spencer, both of whom read the work in
a somewhat different form in MS. and made many valuable
suggestions, and from Messrs S. Adlard and R. L. Elderton,
who then spent much time in reading proofs and suggested
difficulties that would probably arise and ways of removing
them. In connection with the new edition, Mr H. B. Smither
has helped with some of the calculations, and both he and
Mr H. T. Adlard have read the book in proof, help for which I
am, indeed, grateful. At many stages in the work my sister,
Miss Ethel M. Elderton, has come to my aid, and, bearing in
mind her experience in teaching the subject as well as her
practical work, it would have been better if the book had been
hers and not mine; any improvement in this edition is probably
hers already.

W. P. E.

19 COLEMAN STREET
LONDON, E.C. 2


July 1927 




PREFACE TO THIRD EDITION 


The book has been altered in many respects, and Chapters X,
XI and XII and some of the Appendices have been rewritten.
The notation for moments in the earlier editions has been
retained. Some writers find it helpful to use distinct symbols
for the “theoretical moment” and the “adjusted statistical
moment”. In practical curve fitting the two are equated.
The notation I use treats the latter as identical with the
former. Readers of other work, and especially of continental
work, must bear in mind these and other differences in
notation.

I am most grateful to Professor E. S. Pearson for the help
and advice he has given me so generously and sympathetically.
It is also a pleasure to thank Mr H. Latham Seal
for many suggestions and him and Mr H. J. Tappenden for
much help in connection with the proofs.

I hope these kind friends will not be thought to be in any
way responsible for my shortcomings.

W. P. E.


October 1937



CHAPTER I
INTRODUCTORY 

1. The ordinary treatment of probability begins with the
assumption that the chance that a certain event will occur is
known, and proceeds to solve the problems that arise from
the combination of events or the repetition of a particular
experiment; it proves that a certain result is more likely to
occur from experiment than any other, that a result based on
a limited number of trials is unlikely to differ greatly from
the expected result, and that the proportional deviation from
the most probable result will generally decrease as the number
of trials is increased.

Experiments can easily be made to show that the theoretical
method leads to results which can be realised in practice when
the probabilities can be estimated accurately beforehand; for
example, various trials have been made with coin tossing in
which it has been found that if five coins are tossed together
and the number of them coming down “heads” is recorded,
then the distribution of the cases will agree with the binomial
expansion $(\tfrac12 + \tfrac12)^5$ as the ordinary theory leads us to expect.
Sequences of “heads” or “tails” form a series approximating
to the geometrical progression with a common ratio of $\tfrac12$, and
the drawing of cards from a pack gives a result closely agreeing
with the numbers that theoretical work suggests.
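
The coin-tossing comparison may be tried numerically. What follows is only a sketch, assuming fair coins and a pseudo-random generator in place of real tosses; the names `binomial_terms` and `toss_experiment`, the seed and the number of trials (3200) are chosen here for illustration and do not come from the trials referred to above.

```python
import random
from math import comb

def binomial_terms(n, total):
    """Expected counts from total * (1/2 + 1/2)**n, one entry per number of heads."""
    return [total * comb(n, k) / 2 ** n for k in range(n + 1)]

def toss_experiment(n, trials, seed=0):
    """Toss n fair coins `trials` times; count how often each number of heads occurs."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    counts = [0] * (n + 1)
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        counts[heads] += 1
    return counts

expected = binomial_terms(5, 3200)   # 100, 500, 1000, 1000, 500, 100
observed = toss_experiment(5, 3200)  # simulated counts, close to but not equal to the above
for k, (e, o) in enumerate(zip(expected, observed)):
    print(f"{k} heads: expected {e:7.1f}, observed {o}")
```

The observed column will differ a little from the expected one at every run length of the experiment; the proportional differences should generally shrink as the number of trials is increased.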

2. It frequently happens, however, that the probabilities
are not known, and it is impossible to tell whether we are
dealing with an experiment like coin tossing or sequences or
card-drawing; in fact, the only thing known is the distribution
of the number of cases into certain groups, and in these
circumstances the inverse problem of tracing the theoretical series
to which the statistics approximate may become an important
matter. The difficulty of the subject is increased because
statistics do not give the theoretical distribution exactly, and
it is impossible to tell where the differences between the actual
and theoretical results lie. To make the position clearer it will
be well to restate the problem and ask whether it is possible
to find the theoretical series to which a series, resulting from
a statistical experiment, approximates. It may be difficult,
perhaps impossible, to trace the probabilities corresponding
to a given case, but yet practicable to form a reasonable
opinion of the series of numbers that might be reached if
the experiment could be repeated an infinite number of times.
On turning to the reasons which make it advisable to find this
ideal result to which statistics approach, it will be seen that
the exact elementary probabilities are not of supreme importance,
and a reasonable representation of the series is of far
greater practical value. We notice that one of the first objects
of a statistician or an actuary dealing with statistical work
is to express the observations in a simple form so that practical
conclusions can be easily drawn from the figures that have
been collected. If the available statistics fall naturally into
fifty or sixty groups, he has to decide how they can be arranged
to bring out the important features of the problem on which
he is working, whereas if he can find a few numbers closely
connected with the original series which can be used as an index
to the whole, he can then give the result in a way that might
assist comparison with similar statistics, and enable others
who have to deal with the facts to appreciate the whole
distribution more readily than they could do if it remained in its
original form. The statistician has also to supply approximate
values for intermediate terms when only a few can be obtained
from his experience, or complete or continue a series when only
a part of it is known. In many cases he has to keep the same
terms as in his original series, but remove the roughnesses of
material due to limitations in the number of cases available
for his investigation; that is, he has to graduate his data.
3. In reality these objects are much alike, for if the
statistical tables can be represented by an algebraic or
transcendental formula, we can replace the whole series of numbers
by a few values (the constants in the formula) which, if we deal
systematically with the distributions we meet, facilitate
comparison or enable us to supply missing terms, while the
roughness of the original material can be removed by making a
suitable formula represent the original statistics as nearly as
possible. If a formula is based on theoretical considerations,
it may also give a solution of the problem in probabilities
mentioned at the outset, and we see that both the practical
and theoretical requirements can be dealt with at the same
time, for the smooth series sought by the theoretical student
is the same thing as the formula required for practical work.

4. The advantages of any system of curves depend on the
simplicity of the formulae and the number of classes of
observations that can be dealt with satisfactorily, for a
complicated expression is very little improvement on the original
groups of statistics, and a system which is not capable of
general application leaves the statistician in difficulties
whenever it breaks down. One other thing is necessary; if a formula
is known to be a suitable one, there must be some method of
finding the arithmetical constants that will give a good
agreement in the particular case. Such a method, if it is to be of
practical use, must be simple, reliable and capable of general
and systematic application.

A broad idea of the objects to be accomplished ought to be
kept clearly before the mind; they are likely to be forgotten
because of the large amount of detail necessarily connected
with the subject. It is also important because the advantages
of systematic treatment are often overlooked, and short cuts
and rough and ready methods are adopted to the detriment of
the work, and formulae having no scientific basis and having
no connection with others suitable to similar cases are
sometimes used in rather haphazard fashion by statisticians. The
consequence is that generalisation is impossible, and where
a law might be found one can see little but a great variety of
attempts by energetic workers to reach their own conclusions
regardless of the value of comparative statistics.



CHAPTER II


FREQUENCY DISTRIBUTIONS

1. If statistics are arranged so as to show the number of
times, or frequency with which, an event happens in a
particular way, then the arrangement is a frequency distribution.
Although some of our results will be of wider applicability, we
shall generally confine our attention to these distributions.

2. It is necessary to have a name for the formula used to
describe such distributions, and the term “frequency-curve”
has been adopted for the purpose. The geometrical progression
which describes the number of sequences in any direct
experiment, such as coin tossing or dice throwing, is, in the limit,
a frequency-curve, the equation to which is $y = Na^x$.

3. Some distributions give the number of cases falling in
a certain group of values of the independent variable, while
others (e.g. Example V of Table I) give the number of cases
for an exact value. In the former case the exact values of
the independent variable to which the groups correspond must
be considered; for instance, “exposed to risk at age x” includes
those from $x - \tfrac12$ to $x + \tfrac12$, but the number of deaths at duration
n those from n to n + 1. When statistics are represented
graphically, effect should be given to these differences, and, to
bring out the points a little more clearly, the diagrams on
pp. 5 and 6 have been prepared. The drawings of distributions,
such as those in the diagrams, are called frequency polygons
or histograms.

4. When statistics give the number of cases for an exact
value of the independent variable, it is simple to plot them
in a diagram by drawing ordinates and joining their tops,
but in the case of groups of values there is a little complication,
for we can either draw a rectangle standing on the entire base
(Example II of diagram) or put in ordinates at the middle
points of the bases and then join their tops (Example III).
The former method gives the correct idea of the amount of
information conveyed by the statistics, but, for some purposes
(e.g. for seeing the possible shape of the curve), the latter is
more convenient, though it is open to technical objection.
Cases such as Examples I and IV are best expressed by the
kind of drawing given, while Example III, though open to
technical objection, gives a better indication to most people
of the shape of the actual distribution than a block
diagram.

5. The reader is no doubt already familiar with the fact that
statistics tend towards a smooth series as the total number of
cases is increased, and from this it can be seen how naturally
practical statistics lead to the conception of a frequency-curve
to describe the smooth distribution that would be obtained if
an infinite supply of homogeneous material were available for
investigation. In other words, such curves would give an
approximation to the total “population” of which the
particular case investigated was a sample.



Table I

Example I.    Withdrawals with monthly incidence “0.5” in year of exit,
              by curtate duration (Principles and Methods, p. 92).
Example II.   Exposed to risk of sickness, by age
              (Watson, M.U. Tables, p. 19).
Example III.  Existing at close of observations, Without Profit “Old”
              Assurances, by age.
Example IV.   Existing at close of observations, “Old” Annuities
              (females), by age.
Example V.    Terms of the expansion of $1000(\tfrac34 + \tfrac14)^{12}$,
              by number of term.

 Curtate    Example   Ages       Example    Example    Example    Example   No. of
duration          I                   II        III         IV          V     term

       1        308   -19             34                               32        1
       2        200   20-24          145                              127        2
       3        118   25-29          156                              232        3
       4         69   30-34          145                              258        4
       5         59   35-39          123                              194        5
       6         44   40-44          103          3                   103        6
       7         29   45-49           86          9                    40        7
       8         28   50-54           71         42                    11        8
       9         26   55-59           55        111         29          2        9
      10         21   60-64           37        176         23          1       10
      11         18   65-69           21        200         81                  11
      12         18   70-74           13        193        151
      13         12   75-79            7        160        192
      14         11   80-84            3         73        239
      15          5   85-89            1         26        157
      16         11   90-94                       6         93
      17          7   95-99                       1         29
      18          6   100-                                   6
      19          1
      20          3
      21          1
      22          3
      23          2

Total         1,000                1,000      1,000      1,000      1,000

True total    1,308            2,995,724      2,674        172
Mean          4.182              37.8750     68.485     79.400      3.998
Standard
deviation    4.1996              2.76810   1.771288   1.774894    1.46215
Type              I                    I         II        VII



6. It may be noticed that a frequency-curve can be
interpreted to give a frequency corresponding to every value of the
independent variable along the whole range of the distribution,
and will not restrict us to a few more or less arbitrary groups
as is necessary with actual statistics. The binomial series and
geometrical progression do the same when we imagine we are
dealing with something that can be divided into a very large
number of groups. Thus, if we mix a large quantity of sand of
two colours and take out a fixed quantity of the mixture and
record the number of grains of sand of either colour in each
drawing, we should obtain a continuous curve from a large
number of trials.

7. We will now define some important functions. When
a distribution is arranged according to the progressive values
of a variable characteristic, e.g. duration, age, etc., the
average value of that characteristic (not the average of the
frequencies) is called the mean of the distribution, and is given
by

$$\frac{f_a \times a + f_b \times b + f_c \times c + \dots + f_n \times n}{f_a + f_b + f_c + \dots + f_n}$$

where $f_r$ is the frequency corresponding to the value r of
the variable; thus, in Example I, 200 is the frequency
corresponding to 2. If we assume infinitesimal increments, the
mean is given by

$$\frac{\int x f_x \, dx}{\int f_x \, dx}$$

where the limits of the integral will be such as to cover the
whole distribution. The mean could also be described as the
position of the ordinate through the centre of gravity of the
distribution (centroid vertical); this may be of help to some
readers.
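
The definition of the mean can be checked against Table I. The following is a minimal sketch; the function name `distribution_mean` is introduced here only for illustration, and for Example IV each group is taken at its central age, which is an assumption of the sketch rather than something stated at this point in the text.

```python
def distribution_mean(values, frequencies):
    """Mean of the variable: sum(f * x) / sum(f), not the average of the frequencies."""
    total = sum(frequencies)
    return sum(f * x for x, f in zip(values, frequencies)) / total

# Example I of Table I: curtate durations 1..23 and frequencies per 1000.
durations = list(range(1, 24))
withdrawals = [308, 200, 118, 69, 59, 44, 29, 28, 26, 21, 18, 18,
               12, 11, 5, 11, 7, 6, 1, 3, 1, 3, 2]

# Example IV of Table I, each quinquennial group taken at its central age.
central_ages = [57, 62, 67, 72, 77, 82, 87, 92, 97, 102]
annuitants = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]

mean_i = distribution_mean(durations, withdrawals)
mean_iv = distribution_mean(central_ages, annuitants)
print(round(mean_i, 3), round(mean_iv, 3))  # 4.182 79.4
```

Both figures agree with the means shown at the foot of Table I for these two examples.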

8. The mode is the characteristic that occurs most
frequently; in other words, it is the position of the maximum
ordinate. We cannot tell from the rough statistics which
ordinate is greatest and the mode can therefore only be
determined approximately until the law connecting the
various groups, i.e. the frequency-curve, is known.

9. Now since an equation or curve might be used for several
distributions, one given according to age, a second of a different
subject according to duration, a third according to sums
assured, and so on, we must have a standard of reference based
on the distribution itself. For this purpose a function known
as the standard deviation is used. It is given by

$$\sqrt{\frac{f_a a'^2 + f_b b'^2 + \dots + f_n n'^2}{f_a + f_b + \dots + f_n}}$$

where $a', b', \dots, n'$ are the distances from the mean. In the
form of integrals the standard deviation is

$$\sqrt{\frac{\int f_x x^2 \, dx}{\int f_x \, dx}}$$

where x is measured from the mean.

The standard deviation measures the way the frequencies
are distributed in terms of the unit of measurement. As the
frequencies farthest from the mean are multiplied by the
largest values of $x^2$, a large standard deviation shows that the
frequency distribution spreads out from the mean, while a
small standard deviation shows that the frequency is closely
concentrated about the mean. In considering the relative
sizes of standard deviations, it is necessary to bear in mind
the unit of measurement, because, if a given distribution is
arranged in two series, first, according to years of age, and
then in quinquennial age groups, the standard deviation will
be five times as large in the latter case as it is in the former.
This can be seen at once by comparing the two expressions

$$\sqrt{\frac{\int f_x x^2 \, dx}{\int f_x \, dx}} \quad\text{and}\quad \sqrt{\frac{\int f_x (5x)^2 \, dx}{\int f_x \, dx}}$$

The latter is obviously five times the former. The values of
the standard deviations are given in Table I for each case.
The diagram on p. 11 shows two curves having the same
mean B and approximately the same area, but the dotted
curve has the larger standard deviation because it spreads
out more on each side of the mean.

The reader will notice from the algebraic expressions given
above that the mean, mode and standard deviation are not
dependent on the number of cases (i.e. on the absolute size
of the curve), but merely on the way they are distributed
(i.e. on the proportionate numbers or the shape of the curve).
The standard deviation measures the “spread” or “scatter”
of the statistics from the mean.
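
These properties of the standard deviation may be verified on Example I of Table I. The sketch below is illustrative only: `mean_and_sd` is a name chosen here, no adjustment of any kind is made to the crude sums, and the factor of five is applied to the durations simply to imitate a change of unit, not because Example I was ever tabulated that way.

```python
from math import sqrt

def mean_and_sd(values, frequencies):
    """Mean and standard deviation of a frequency distribution."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / total
    return mean, sqrt(variance)

# Example I of Table I: curtate durations 1..23 and frequencies per 1000.
durations = list(range(1, 24))
freq = [308, 200, 118, 69, 59, 44, 29, 28, 26, 21, 18, 18,
        12, 11, 5, 11, 7, 6, 1, 3, 1, 3, 2]

mean, sd = mean_and_sd(durations, freq)
print(round(mean, 3), round(sd, 4))  # 4.182 4.1996

# The same distribution with every distance multiplied by five (hypothetical change of unit).
_, sd_rescaled = mean_and_sd([5 * x for x in durations], freq)
print(round(sd_rescaled / sd, 6))  # 5.0

# The same shape with ten times as many cases: the deviation is unchanged.
_, sd_more_cases = mean_and_sd(durations, [10 * f for f in freq])
print(abs(sd_more_cases - sd) < 1e-9)  # True
```

The first line reproduces the mean and standard deviation printed for Example I in Table I; the other two illustrate the dependence on the unit of measurement and the independence of the number of cases.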

10. An examination of frequency distributions (see Table I
and pp. 5 and 6) shows that most of them start at zero,
gradually rise to a maximum, and then fall sometimes at a very
different rate. If the rise and fall are at the same rate, the
distribution will be symmetrical about the mean, which must then
coincide with the mode. The difference between the mean and
mode is therefore a function of the skewness or deviation from
symmetry. In order to get a satisfactory measure, the spread
of the material must be taken into account, and this leads
us to measure skewness by the distance between mean and
mode divided by standard deviation. If the mean is on the
left-hand side of the mode when the statistics are plotted out
in a diagram, this function will be negative, and to remember
the sign it is convenient to write:

$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

The diagram on p. 11 will help to show the rationale of the
measure for skewness. It gives two curves having the same
mean B and the same mode A, but with different standard
deviations, and it is clear that the dotted curve, with its larger
standard deviation, is more nearly symmetrical than the other
curve.

11. We may summarise these functions by saying that the 
mean and mode fix the position of the curve on the axis; the 
standard deviation shows how the material is distributed about 
the mean, and the skewness shows the amount of the deviation 
from symmetry exhibited by the material. 

These preliminary definitions will be sufficient for our
present purpose, but the functions defined will be more easily
understood when their actual connection with the practical
work of curve-fitting has been studied. A student working
at the subject for the first time should plot out several
distributions on cross-ruled paper, in order to familiarise himself with
their nature and appearance. He should calculate and insert
the means in the diagrams, but should not attempt to calculate
standard deviations until he knows something of the method
of moments.



12. Up to this point we have defined our statistics as
frequencies, that is, as a number of cases grouped together
as alike either because they are actually alike in the sense of
Example V or because the statistics throw them up in
comparatively narrow groupings as in Examples II, III and IV.
When, however, we are tabulating our experience we have to
deal with individual observations and they are grouped
subsequently. From this point of view if there are N observations
we may call them $o_1, o_2, o_3, \dots, o_N$, where $o_1$ may stand for
the first observation and may be (see Example III) one of
the 200 existing in the 65-69 group. It might be a case
“existing” at age 66.12, and $o_2$ might be “existing” at age
73.72, $o_3$ at 42.26, $o_4$ at 67.37 and so on. Then the mean is

$$\frac{o_1 + o_2 + o_3 + \dots + o_N}{N}$$



CHAPTER III 


METHOD OF MOMENTS

1. Before we proceed to deal with suitable forms for use as
frequency-curves, it will be well to see if some method of
applying them to statistical examples can be found, for it is
clearly useless to suggest a curve and have no way of using it.
We require, therefore, a general method by which a given
formula can be fitted to a particular statistical experience,
and may be applied to any expression (for instance, Makeham’s
formula for the force of mortality) on which we may have
decided as the basis of graduation. The first point to be noticed
in searching for a method is that if there are n constants in the
formula, we must form n equations between the formula and
the statistics. Thus, if we have three terms, say, y = 20, 40,
and 88, when x = 1, 2, and 3 respectively, and wish to use the
curve $y = a + bx + cx^2$ to describe them, we can, of course,
find values of a, b and c so that each item is exactly reproduced
by equating as follows:

$$\begin{aligned}
a + b + c &= 20 \\
a + 2b + 2^2 c &= 40 \\
a + 3b + 3^2 c &= 88
\end{aligned}$$

But if we have a fourth term y = 96 when x = 4, and use the
values of a, b, and c found from the three equations just given,
we should find that when x = 4, y = 164. This suggests that
when there are more terms in the statistics than there are
constants, the equations must be formed by using all the terms,
not by selecting from them. The graduating curve will not
necessarily reproduce exactly any of the observations, but
will run evenly through the roughnesses of the observed facts
so as to represent their general trend.

2. Let $a_1, a_2, a_3, \dots, a_n$ be n terms to be graduated; then, if
the series were perfectly smooth and followed a known law,
each term could be reproduced exactly by, say, $b_1, b_2, b_3, \dots, b_n$,
where $a_1 = b_1$, $a_2 = b_2$, $a_3 = b_3$, ... and $a_n = b_n$. Now, if we
consider the two series (the a’s and the b’s), we see that since
each term is reproduced exactly

$$\sum_{r=1}^{n} a_r = \sum_{r=1}^{n} b_r \quad\text{and}\quad \sum_{r=1}^{n} c_r a_r = \sum_{r=1}^{n} c_r b_r$$

where $c_r$ is a numerical coefficient.

This suggests a possible method to apply when each term
cannot be reproduced exactly. The total of the graduated
figures must be made equal to the total of the ungraduated,
and the further equations necessary for finding the unknown
constants must be formed by multiplying the various terms
by different factors and similarly equating the sums of the
graduated and ungraduated products, i.e. $\sum c_r a_r = \sum c_r b_r$. It
still remains to decide the best form to be given to $c_r$, and the
mean being equal to

$$\frac{a_1 + 2a_2 + \dots + n a_n}{a_1 + a_2 + \dots + a_n}$$

suggests that $c_r = r$ should give one reasonable equation.
Again, since we shall have to use some function of r which,
when applied to the graduation formula, will give an integrable
form (otherwise we cannot make an equation between $\sum c_r b_r$
and $\sum c_r a_r$), the powers of r suggest themselves as convenient
when integration by parts is attempted. If, therefore, we write
$c_r = r^t$ and give t successively the values 0, 1, 2, ..., we can
obtain as many equations as we require, and from the first
two of them we find, successively, the area and the mean,
which will be the same in the graduated and ungraduated
figures.

This method is known as the Method of Moments (cf.
moments of inertia), and experience has shown that it is a
satisfactory method of fitting a curve to an actual statistical
experience. Confirmation on the theoretical side has been
produced, and while it is possible to invent other methods of
fitting particular curves, as has been done by actuaries in
connection with Makeham’s law of mortality, no better general
method has been produced (see Appendix V for note on
other methods).

3. Applying the method to solve the three equations given
above, we have

$$\begin{aligned}
(a + b + c) + (a + 2b + 2^2 c) + (a + 3b + 3^2 c) &= 20 + 40 + 88 \\
(a + b + c) + 2(a + 2b + 2^2 c) + 3(a + 3b + 3^2 c) &= 20 + 2 \times 40 + 3 \times 88 \\
(a + b + c) + 2^2 (a + 2b + 2^2 c) + 3^2 (a + 3b + 3^2 c) &= 20 + 2^2 \times 40 + 3^2 \times 88
\end{aligned}$$

or

$$\begin{aligned}
3a + 6b + 14c &= 148 \\
6a + 14b + 36c &= 364 \\
14a + 36b + 98c &= 972
\end{aligned}$$

These equations will give the same result as those from which
they were formed, because each of the three terms can be
graduated exactly, but if we introduce the fourth term,
x = 4, y = 96, we can modify the moment method by adding
a fourth term to each equation given above and obtain

$$\begin{aligned}
4a + 10b + 30c &= 244 \\
10a + 30b + 100c &= 748 \\
30a + 100b + 354c &= 2508
\end{aligned}$$

The solution of these equations gives

$$a = -23.0, \qquad b = 42.6, \qquad c = -3.0$$

or

x = 1,  y = 16.6
x = 2,  y = 50.2
x = 3,  y = 77.8
x = 4,  y = 99.4
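
The arithmetic of this small example can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not a general curve-fitting routine: the helper names `solve_linear`, `fit_by_moments` and `curve` are invented here, the curve is restricted to a polynomial in x, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a small square system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def fit_by_moments(xs, ys, n_constants=3):
    """Equate sum(x**t * y) for observed and fitted values, t = 0, 1, ..., n_constants - 1."""
    matrix = [[sum(x ** (t + j) for x in xs) for j in range(n_constants)]
              for t in range(n_constants)]
    rhs = [sum(x ** t * y for x, y in zip(xs, ys)) for t in range(n_constants)]
    return solve_linear(matrix, rhs)

def curve(constants, x):
    """Value of a + b*x + c*x**2 + ... at x."""
    return sum(c * x ** j for j, c in enumerate(constants))

three = fit_by_moments([1, 2, 3], [20, 40, 88])
four = fit_by_moments([1, 2, 3, 4], [20, 40, 88, 96])

print([float(c) for c in three])                      # [28.0, -22.0, 14.0]
print(float(curve(three, 4)))                         # 164.0
print([float(c) for c in four])                       # [-23.0, 42.6, -3.0]
print([float(curve(four, x)) for x in (1, 2, 3, 4)])  # [16.6, 50.2, 77.8, 99.4]
```

With three terms the moment equations reproduce the exact fit, which gives 164 when carried on to x = 4; with four terms they give the constants and graduated values stated above, whose total, 244, equals the total of the ungraduated terms.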


This is a very simple example, but it will probably help to show
the way results are reached, and will serve as a foundation
for what follows.

4. The nth moment of a particular frequency is defined as
the product of the frequency and the nth power of the distance
of the frequency from the vertical about which moments are
being taken, or the nth moment of any ordinate y of a
frequency-curve about the vertical through a point, distance x from it, is
$yx^n$, and the nth moment of the whole distribution treated as
a series of ordinates is $y_1 x_1^n + y_2 x_2^n + \dots$, where $y_1 + y_2 + \dots$ is
the total frequency. Thus, in Example V, the third moment
of the frequency 40 for term 7 about the vertical through
3 is $40 \times (+4)^3$.

5. If the ordinates are known, we can calculate the moment
for them immediately by multiplying the frequencies by the
powers of the distances between them and the vertical about
which the moments are required and then adding the results,
care being taken to give the distances their proper signs. If
areas are given, an approximation is made by assuming them
to be concentrated about the ordinates at the middle points
of the bases on which they stand. The columns after the third
in Table II show the calculation of the moments about the
vertical through age 77 for Example IV of Table I, on the
assumption that the frequencies are concentrated at the middle
points of the bases.

The unit of grouping has been taken as 5 years, and if, as is
often convenient, we assume the total frequency to be unity,
the totals will have to be divided by 1000. We should generally
deal with the actual numbers that occur, but as they have
been given in Table I as the distribution of 1000 cases, it will
be better to use them in that way in the present case. The
numbers -4, -3, ... in col. (3) show the distances from age 77
in terms of the unit of grouping. The centre of any other group
would have done almost as well as 77; it is convenient to choose
the arbitrary origin so that it is near the mean of the
distribution. This makes easier the calculation of the moments about
the mean (a result frequently required), and enables the
calculator to get a rough check on these moments by comparing
them with those about the arbitrary origin. The cols. (4)-(7)
are sufficiently explained by their headings; they are formed
successively and checked by multiplying f by $s^4$, the values
of $s^4$ being taken from a table of the powers of the natural
numbers.


Table II

Central    Frequency   (x-77)/5
age of         f          = s       f × s     f × s²     f × s³     f × s⁴
group
  (1)         (2)         (3)        (4)        (5)        (6)        (7)

   57          29         -4         116        464      1,856      7,424
   62          23         -3          69        207        621      1,863
   67          81         -2         162        324        648      1,296
   72         151         -1         151        151        151        151
   77         192          0        -498                -3,276
   82         239         +1         239        239        239        239
   87         157         +2         314        628      1,256      2,512
   92          93         +3         279        837      2,511      7,533
   97          29         +4         116        464      1,856      7,424
  102           6         +5          30        150        750      3,750

Totals      1,000                   +978      3,464     +6,612     32,192
                                    +480                +3,336
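The column totals of Table II can be reproduced with a few lines of code. The following is a minimal sketch, assuming the frequencies of Example IV as tabulated above; the function name `raw_moment_totals` is illustrative and does not come from the text.

```python
# Direct ("product") calculation of the moment totals about an arbitrary origin.
# Data: Example IV as set out in Table II (central ages and frequencies).
ages = [57, 62, 67, 72, 77, 82, 87, 92, 97, 102]
freq = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]

ORIGIN = 77   # arbitrary origin, chosen near the mean
UNIT = 5      # unit of grouping, in years


def raw_moment_totals(ages, freq, origin, unit, max_power=4):
    """Return [sum f, sum f*s, sum f*s^2, ...] with s = (x - origin) / unit."""
    # The central ages are spaced exactly one unit apart, so the division is exact.
    s = [(x - origin) // unit for x in ages]
    return [sum(f * d**t for f, d in zip(freq, s)) for t in range(max_power + 1)]


totals = raw_moment_totals(ages, freq, ORIGIN, UNIT)
# Dividing by the total frequency gives the moments for unit frequency.
moments_about_77 = [t / totals[0] for t in totals]
print(totals)
print(moments_about_77)
```

Dividing the totals by the total frequency (here 1000) gives the moments about age 77 in units of five years.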



Notation for Moments

$N$ = total frequency.

$\nu_n$ = nth unadjusted statistical moment about mean.

$\nu'_n$ = nth unadjusted statistical moment about any other point.

$\mu_n$ = nth moment from curve about mean
        = nth adjusted statistical moment about mean.

$\mu'_n$ = nth moment from curve about other point
         = nth adjusted statistical moment about other point.

Note: $\nu$, $\nu'$, $\mu$ and $\mu'$ always refer to a total frequency of unity.


The arithmetical work may be checked in other ways; for
instance, instead of checking the final column by multiplying
each term by the appropriate value of $x^4$, we can form a new
column $(x+1)^4 f$ (x here standing for the distance s of
col. (3)), which is the same thing as

$$x^4f + 4x^3f + 6x^2f + 4xf + f.$$

The total of this new column can therefore be used to give a
check on the multiplication and addition. In the numerical
example (Table II) we should have $29 \times (-3)^4$, $23 \times (-2)^4$,
etc., $6 \times 6^4$; the total of such a column is 69,240, which agrees
with the totals of cols. (2)-(7) in the following way:

                 32,192
    4 × 3,336 =  13,344
    6 × 3,464 =  20,784
    4 ×   480 =   1,920
                  1,000
                 ------
                 69,240
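The check just described lends itself to a short numerical confirmation. This is a sketch only, using the Table II frequencies; the variable names are illustrative.

```python
from math import comb

# Frequencies of Table II and their distances s from the arbitrary origin (age 77).
freq = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]
s = list(range(-4, 6))

# Totals of f * s^t for t = 0..4, i.e. the totals of cols. (2) and (4)-(7).
totals = [sum(f * d**t for f, d in zip(freq, s)) for t in range(5)]

# Left-hand side: the total of the new column (s + 1)^4 * f.
check_column_total = sum(f * (d + 1)**4 for f, d in zip(freq, s))

# Right-hand side: binomial combination 1, 4, 6, 4, 1 of the existing totals.
binomial_total = sum(comb(4, t) * totals[t] for t in range(5))

print(check_column_total, binomial_total)
```

If a slip had been made in any of the product columns, the two figures would disagree.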


Helpful tables (Powers and Fourth moments) will be found
in Tables for Statisticians and Biometricians edited by K.
Pearson (Cambridge University Press). I shall in future refer
to this book as Tables for Statisticians. A student can manage
without these volumes, but at some expense of trouble.

6. It has so far been assumed that moments can be calcu-
lated about any point, but it is frequently inconvenient to do
so, for if we had required them about age 79.4, we should
have had to multiply by the powers of (57 - 79.4)/5, of
(62 - 79.4)/5 and so on, and it is quite clear that the labour
would have been very great. In such a case we can, however,
take the moments about any other more convenient point,
and then modify them in the following way.

Let the distance between A, about which the moments are
known, and B, about which they are required, be +d; thus,
if we want moments about 25.7 and have found them about 25,
d is .7; if we had found them about 26, d would have been -.3.
Then, if the distance of any ordinate $y_r$ from A is $X_r$, and from
B is $x_r$, then

$$x_r = X_r - d \quad\text{and}\quad x_r^n = (X_r - d)^n.$$

Now, the nth moment of the whole distribution treated as a
series of ordinates is $\sum y_r X_r^n$ about A, and $\sum y_r x_r^n$ about B, so we
have

$$\nu''_n = \sum y_r x_r^n = \sum y_r (X_r - d)^n
        = \sum\left[y_r\{X_r^n - n d X_r^{n-1} + \dots + (-d)^n\}\right]$$
$$\phantom{\nu''_n} = \nu'_n - n d\,\nu'_{n-1} + \frac{n(n-1)}{2!} d^2 \nu'_{n-2} - \dots \qquad (1)$$

where $\nu''_n$ is written for the nth moment about B, and $\nu'_n$ the
nth moment about A.

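Instead of (1) we may proceed as follows:

$$\nu'_n = \sum y_r X_r^n = \sum\left[y_r (x_r + d)^n\right]
        = \nu''_n + n d\,\nu''_{n-1} + \frac{n(n-1)}{2!} d^2 \nu''_{n-2} + \dots$$

so that

$$\nu''_n = \nu'_n - n d\,\nu''_{n-1} - \frac{n(n-1)}{2!} d^2 \nu''_{n-2} - \dots \qquad (2)$$

There is little to choose between these two formulae, and of
course they give identical results.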

7. We will now apply them to work out the moments about
the centroid vertical (i.e. vertical through the mean) for the
example in Table II. The distance of the mean from any point is

$$\frac{1}{N}\sum (X_r y_r)$$

where N is the total frequency; or we may say that the distance
of the mean from any point is the first moment of the distribu-
tion about the vertical through that point. It follows that the
first moment about the centroid vertical is zero, so that if such
moments are required the term involving $\nu''_1$ in (2) is zero.
When we deal with frequency-curves we shall see that we
generally require moments about the centroid vertical and in
designating them we shall leave out the dashes and use $\nu$.

8. The arithmetical work is as follows:

The totals in cols. (4)-(7) are divided by the number of
observations (total of col. (2)), and the quotients are the
moments ($\nu'$) about 77. The moments are dealt with as having
reference to a case where unity is the total frequency, i.e.
proportional, not actual, frequencies are dealt with.

    $\nu'_1 = .480$      $\nu'_2 = 3.464$

    $\nu'_3 = 3.336$     $\nu'_4 = 32.192$

The value of $\nu'_1$ gives the mean age = 77 + 5 × .480 = 79.4. In
order to use formula (1) or (2), the value of d is required, and
when the calculation of moments has to be made about the
centroid vertical its value is, as we have seen above, the same
as $\nu'_1$; in the present case it is the first moment about the vertical
through age 77. The powers of d are next calculated by
logarithms. As it happens d is a comparatively simple number;
if it had been .48327, say, the propriety of using logarithms
would have been more obvious.

    $d^2 = .2304$     $d^3 = .110592$     $d^4 = .0530842$

Remembering that $\nu_1$ is zero and $\nu'_0$ and $\nu_0$ are each unity
because they are the total frequency, we reach

    $\nu_2 = 3.2336$     $\nu_3 = -1.430976$     $\nu_4 = 30.416289$

It is wise to work to a large number of decimal places
because, owing to the subtractions involved, calculations
which began with, say, seven figures may end with only five.
It is well, therefore, to use a seven-place logarithm table
(e.g. Chambers's) and antilogarithm table (e.g. Filipowski's)
or a multiplying machine.
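The transfer from the arbitrary origin to the mean by formula (1) can be sketched in code. This is an illustration under stated assumptions, not the book's own procedure: exact fractions replace the logarithm tables, and the function name `shift_moments` is hypothetical. Carried out exactly, the fourth moment comes to 30.41626, which agrees to five decimals with the figure reached by the summation method in § 12.

```python
from fractions import Fraction
from math import comb


def shift_moments(raw, d):
    """Apply formula (1): moments about B from moments about A, where B = A + d.

    raw[k] is the k-th moment about A for unit total frequency (raw[0] == 1).
    """
    return [
        sum(comb(n, k) * (-d) ** k * raw[n - k] for k in range(n + 1))
        for n in range(len(raw))
    ]


# Moments about age 77 from Table II (in units of five years).
raw = [Fraction(1), Fraction(480, 1000), Fraction(3464, 1000),
       Fraction(3336, 1000), Fraction(32192, 1000)]

d = raw[1]                      # distance of the mean from the arbitrary origin
central = shift_moments(raw, d)
mean_age = 77 + 5 * float(d)

print(mean_age)
print([float(v) for v in central])
```

The first moment about the mean comes out as zero, as it must, which is itself a useful check on the arithmetic.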

It will be noticed that the terms required in the calculation
of successive moments can be formed continuously. Thus, in
formula (1), we require to calculate the following multiples:

1 2 ' 1 for the second moment 

3 d ,, third ,, 

„ fourth 

I have found it convenient to adopt a regular system in
calculating moments, as in other statistical work, and create
the habit of putting results and calculations in fixed positions,
so that the arithmetic, which is sometimes complicated, can be
followed quickly and can be confirmed or rectified more easily.

9. Although the above is the usual way to calculate
moments, another method was suggested by the late Sir G. F.
Hardy and used by him in his graduation of the British Offices
Tables 1863-93. He pointed out that by summing the statis-
tical numbers and forming a new series in the same way as
is done by actuaries when the $N_x$ column is formed from the
$D_x$ column and then summing these results (cf. the S column),
and so on, equations can be formed which give the same results
as the method of moments. The arrangement in Table III
shows both the method of calculation and the form of the
expression obtained by the process.

Considering the line opposite the first term, we notice that
the sum of the series is given, and that the second summation,
which we will call $S_2$ when the total frequency is taken as unity,
gives the first moment of the whole distribution about a ver-
tical situated at unit distance before the point corresponding
to f(1). Still considering only the first line, we see that $S_3$
gives each function multiplied by coefficients of the form

$$\frac{n(n+1)}{2} \quad\text{or}\quad \frac{n^2+n}{2},
  \quad\text{i.e. it gives}\quad \frac{\nu'_2 + \nu'_1}{2},$$

where $\nu'$ is written for the moment, because by definition the
tth moment ($\nu'_t$) of the whole distribution is given by the sum
of $n^t f(n)$ for all values of n. $S_4$ and $S_5$ give each function
multiplied by

$$\frac{n^3 + 3n^2 + 2n}{6} \quad\text{and}\quad \frac{n^4 + 6n^3 + 11n^2 + 6n}{24}
  \quad\text{respectively.}$$

The following equations result:

$$S_2 = \nu'_1 \;(= d)$$
$$S_3 = \tfrac{1}{2}(\nu'_2 + \nu'_1)$$
$$S_4 = \tfrac{1}{6}(\nu'_3 + 3\nu'_2 + 2\nu'_1)$$
$$S_5 = \tfrac{1}{24}(\nu'_4 + 6\nu'_3 + 11\nu'_2 + 6\nu'_1)$$

These equations enable us to calculate the moments about
the selected origin, but if it is necessary to find moments about
the mean, the following relations are more convenient; they
can be reached by substituting in the above the values in
formula (2), and remembering that $S_2 = d$.

$$\nu_2 = 2S_3 - d(1+d)$$
$$\nu_3 = 6S_4 - 3\nu_2(1+d) - d(1+d)(2+d)$$
$$\nu_4 = 24S_5 - 2\nu_3\{2(1+d)+1\} - \nu_2\{6(1+d)(2+d) - 1\} - d(1+d)(2+d)(3+d)$$

10. Table IV shows the working in the numerical example
already dealt with by the direct method. The fifth sum is
unnecessary, as the total of the items in the fourth sum gives
the only value required.


Table IV

Frequency      First     Second      Third     Fourth
                sum        sum        sum        sum

     29        1,000      5,480     19,372     54,508
     23          971      4,480     13,892     35,136
     81          948      3,509      9,412     21,244
    151          867      2,561      5,903     11,832
    192          716      1,694      3,342      5,929
    239          524        978      1,648      2,587
    157          285        454        670        939
     93          128        169        216        269
     29           35         41         47         53
      6            6          6          6          6

Total
(for check)
  1,000        5,480     19,372     54,508    132,503


From the totals of the columns we have
$S_2 = d = 5.48$, $S_3 = 19.372$, $S_4 = 54.508$ and $S_5 = 132.503$.

The first value $S_2$ or d shows that the mean is at age
52 + 5.48 × 5 = 79.4. The age 52 is used because it is the centre
of the group before that in which numbers occur, and, as has
been already remarked, the summation method assumes the
work to be done with reference to this position. The application
of the formulae for $\nu_2$, $\nu_3$ and $\nu_4$, given above, enables us to find

    $\nu_2 = 3.2336$     $\nu_3 = -1.43099$     $\nu_4 = 30.4164$
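The summation method of §§ 9 and 10 can be sketched as follows. This is a minimal illustration, assuming the frequencies of Table IV; the helper `repeated_sums` is a hypothetical name, and floating-point arithmetic stands in for the hand calculation, so the last decimals differ slightly from those printed above.

```python
from itertools import accumulate

freq = [29, 23, 81, 151, 192, 239, 157, 93, 29, 6]


def repeated_sums(values, times):
    """Form successive summation columns, each summed from the bottom upwards
    (as an N_x column is formed from a D_x column)."""
    columns, col = [], list(values)
    for _ in range(times):
        col = list(accumulate(reversed(col)))[::-1]
        columns.append(col)
    return columns


cols = repeated_sums(freq, 4)          # first, second, third and fourth sums
N = sum(freq)
S2, S3, S4, S5 = (sum(c) / N for c in cols)

d = S2                                  # mean, measured from one unit before the first term
v2 = 2 * S3 - d * (1 + d)
v3 = 6 * S4 - 3 * v2 * (1 + d) - d * (1 + d) * (2 + d)
v4 = (24 * S5 - 2 * v3 * (2 * (1 + d) + 1)
      - v2 * (6 * (1 + d) * (2 + d) - 1)
      - d * (1 + d) * (2 + d) * (3 + d))

mean_age = 52 + 5 * d
print(mean_age, v2, v3, v4)
```

The column totals reproduce the bottom line of Table IV, and the moments about the mean agree with those found by the direct method.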

11. We may save arithmetical work in several ways when
using the summation method. If, instead of making all the
calculations implied in Table III we stop at the sums next
above the lines ruled in the various columns we shall have as
the final totals

$$\sum f(n),\quad \sum n f(n),\quad \sum \frac{n(n-1)}{2!} f(n),\quad
  \sum \frac{n(n-1)(n-2)}{3!} f(n)$$

and

$$\sum \frac{n(n-1)(n-2)(n-3)}{4!} f(n).$$

The formulae of § 9 for $S_3$, $S_4$ and $S_5$ in terms of $\nu'_1$, $\nu'_2$, etc.
require modification only by altering the alternate signs from
+ to -. The form of moment given in this paragraph (§ 11)
has been called "factorial moments".* A little further saving
of work can be effected by taking the figures up to the totals
next below the lines ruled in Table III. This gives the same
result as that just obtained if the origin is assumed to be shifted
one space. It will suffice if we take this second case for a
numerical example and, using the figures from Table IV,
we should work only so far as 4480 for the second sum,
9412 for the third, 11,832 for the fourth and for the fifth
we sum the last six entries in the fourth column and obtain
9783.

12. These are the direct ways of using the summation
method, but, as in the multiplication method of calculating
moments, we can shorten the work by using a central term
instead of the first term as the starting point or arbitrary
origin.† We shall now use this arrangement with the "factorial
moment" form. A little care is necessary, because, though
there is no difficulty about the interpretation of the sums on
the positive side of the selected point, the moments for the
terms on the negative side assume that multiplications are
made by the powers of negative quantities. Table IV (A)
gives an example of summations that have to be made. The
figure 978 is $\sum n f(n)$ for values on the positive side of the
arbitrary origin and 498 is the similar sum on the negative
side, ignoring sign, say $\sum n f(-n)$. The mean is found by
dividing the difference, $\sum n f(n) - \sum n f(-n)$, by the total
frequency, i.e. (978 - 498)/1000 = .48. Taking, now, the final
figures in the columns headed "Third sum", 670 represents
$\sum \frac{n(n-1)}{2!} f(n)$ and 324 is the corresponding figure on the
negative side. Similarly with the other columns.

* The semi-invariants (or half-invariants) used by Thiele and other writers
can be obtained from moments. The second and third semi-invariants are the
same as the second and third moments about the mean and the fourth semi-
invariant is the fourth moment less three times the square of the second moment
($\mu_4 - 3\mu_2^2$).

† I have to thank Mr G. J. Lidstone for the suggestion of shortened sum-
mations.

Now reverting to the expressions in § 11, which relate only
to positive summations, we can write

$$\nu'_2 = 2S'_3 + S'_2$$
$$\nu'_3 = 6S'_4 + 6S'_3 + S'_2$$
$$\nu'_4 = 24S'_5 + 36S'_4 + 14S'_3 + S'_2$$

where $S'$ is used instead of $S$ to indicate the different system
of summation.

In Table IV (A), however, we have divided the distribution
into two parts, and in applying the expressions just given we
must work out each part separately, adding the items for even
moments and subtracting for odd moments. Hence for the
whole distribution

$\nu'_2$ = (2 × .670 + .978) + (2 × .324 + .498) = 3.464

$\nu'_3$ = (6 × .269 + 6 × .670 + .978) - (6 × .139 + 6 × .324 + .498)
       = 3.336

$\nu'_4$ = (24 × .059 + 36 × .269 + 14 × .670 + .978)
       + (24 × .029 + 36 × .139 + 14 × .324 + .498) = 32.192

and, transferring to the mean,

$\nu_2 = 3.2336$, $\nu_3 = -1.43098$ and $\nu_4 = 30.41626$.

We may now express the work in symbols. Writing P for
summations on the positive side and N for those on the
negative side of the arbitrary origin, we have

$$\nu'_2 = (2P_3 + P_2) + (2N_3 + N_2) = 2(P_3 + N_3) + (P_2 + N_2)$$

and similarly

$$\nu'_3 = 6(P_4 - N_4) + 6(P_3 - N_3) + (P_2 - N_2)$$
$$\nu'_4 = 24(P_5 + N_5) + 36(P_4 + N_4) + 14(P_3 + N_3) + (P_2 + N_2).$$

13. A comparison of Table IV (A) with Table IV will show
that a saving of numerical work is effected by using a central
point as the starting point for the summation, for the sums are
numerically smaller and the value of $S_2$ or d, which enters into
the formulae on p. 20, is much smaller. It will be readily
appreciated that whenever there is a large number of terms the
summation method, and especially the form of it given in
Table IV (A), is an improvement on the product method of
calculating moments. By means of an adding machine the
summations can be obtained mechanically with little trouble,
even for series containing as many as a hundred terms.
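The P and N form of § 12 can be illustrated numerically. The sketch below is a shortcut under stated assumptions: it obtains the factorial sums directly from binomial coefficients rather than by the repeated additions of Table IV (A), which gives the same totals; the function name `factorial_sums` is illustrative.

```python
from math import comb

# Frequencies on either side of the arbitrary origin (age 77), nearest first.
positive = [239, 157, 93, 29, 6]   # n = 1, 2, 3, 4, 5
negative = [151, 81, 23, 29]       # n = 1, 2, 3, 4 (signs ignored)
total_frequency = 1000


def factorial_sums(side):
    """Return [sum n f, sum C(n,2) f, sum C(n,3) f, sum C(n,4) f] for one side."""
    return [sum(comb(n, k) * f for n, f in enumerate(side, start=1))
            for k in range(1, 5)]


P2, P3, P4, P5 = factorial_sums(positive)
N2, N3, N4, N5 = factorial_sums(negative)

# Even moments add the two sides; odd moments subtract the negative side.
first = P2 - N2
second = 2 * (P3 + N3) + (P2 + N2)
third = 6 * (P4 - N4) + 6 * (P3 - N3) + (P2 - N2)
fourth = 24 * (P5 + N5) + 36 * (P4 + N4) + 14 * (P3 + N3) + (P2 + N2)

moments_about_77 = [m / total_frequency for m in (first, second, third, fourth)]
print(moments_about_77)
```

The results are the moments about the arbitrary origin, which must still be transferred to the mean as before.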

14. In § 12 of Chapter II the mean was described alter-
natively in terms of the individual observations, $o_1, o_2, \dots, o_N$.
Similarly the tth moment is

$$\frac{1}{N}\sum_{j=1}^{N} o_j^{\,t}$$

and the tth factorial moment is

$$\frac{1}{N}\sum_{j=1}^{N} o_j(o_j - 1)\cdots(o_j - t + 1).$$

15. It is now necessary to consider the calculation of
moments from the curve, for until this has been done it is
impossible to form equations for finding the constants.
Let $y_x = f(x, a, b, c, \dots)$, where $a, b, c, \dots$ are constants to
be determined.

We have seen, on pp. 13 and 14, that one way of working
would be to find

$$f(1, a, b, c, \dots) \times 1^n + f(2, a, b, c, \dots) \times 2^n + \dots,
  \quad\text{say}\quad \sum_{x=1} f(x, a, b, c, \dots) \times x^n,$$

and this would give a result which might be used in forming
equations if it were not for the fact that it is often impossible
to find an algebraic expression for the sum of such a series in
terms of the constants. It is, however, generally possible to
find such an expression for the integral, and as we have defined
the nth moment of an ordinate $y_x$ as $y_x x^n$, the nth moment of
the whole distribution from x = h to x = k is

$$\int_h^k y_x x^n\,dx \quad\text{or}\quad \int_h^k f(x, a, b, c, \dots)\,x^n\,dx.$$

The total frequency (i.e. total number of cases investigated)
is $\int_h^k y_x\,dx$, and the mean is $\int_h^k y_x x\,dx \Big/ \int_h^k y_x\,dx$, as we have
already noticed.

16. If the moments from the equation to the curve are
calculated in this way and equated to the moments calculated
from statistics by assuming that the latter consist of a series
of ordinates, an inaccuracy is introduced.

Let us consider the two cases:

(1) When the statistics are a system of isolated terms or
    ordinates* and we wish to pass a curve very closely
    through them.

(2) When they are a system of areas but the moments are
    calculated by assuming the areas to be concentrated at
    the middle points of the bases.

* Strictly speaking, not a frequency distribution but a series of values
requiring graduation. Distributions have generally to be dealt with as areas
for frequency-curve work because they tell the way the whole number of cases
is divided in groups, and the whole area between the curve and the axis of x
must therefore be used.

17. In case (1) above, the terms $y_0, y_1, y_2, \dots, y_{n-1}$ are given
by the statistics, and since $\int_{-1/2}^{1/2} y_x\,dx$ is approximately equal
to $y_0$, it is simplest* to assume that $\int_{-1/2}^{n-1/2} y_x\,dx$ is given by the
equation to the curve, and we have to find adjustments to
counteract the error† caused by equating $\sum_{x=0}^{n-1} y_x$ to
$\int_{-1/2}^{n-1/2} y_x\,dx$. The most practical way of overcoming the diffi-
culty is by calculating the true area corresponding to the
ordinates $y_0, y_1, \dots, y_{n-1}$ by means of a quadrature formula
(formula of approximate summation). Many formulae are
well known, but for the present purpose it is convenient to
have expressions which give approximate values of an area
in terms of ordinates lying both within and without the base
on which the area to be valued stands. Symbolically, these
formulae express $\int_{-1/2}^{1/2} y_x\,dx$ in terms of $y_{1/2}, y_{-1/2}, y_{1\frac12}, y_{-1\frac12}$, etc.,
or $y_0, y_1, y_{-1}, y_2, y_{-2}$, etc.

* It is generally possible to use these limits in case (1), but if other limits
have to be taken, such as 0 to n, different quadrature formulae must be used.

† Actuarial readers will notice that the error is analogous to that introduced
by assuming (1

I. Let

$$y_x = a + bx + cx^2 + dx^3 + ex^4;$$

then

$$\int_{-1/2}^{1/2} y_x\,dx = a + \frac{c}{12} + \frac{e}{80}$$

and

$$y_0 = a$$
$$y_{-1} + y_1 = 2(a + c + e)$$
$$y_{-2} + y_2 = 2(a + 4c + 16e).$$

Now, assume the required integral can be equated to

$$j\,y_0 + k(y_{-1} + y_1) + l(y_{-2} + y_2);$$

substitute the values given just above and equate the coef-
ficients of a, c and e respectively to 1, 1/12 and 1/80, and we
have

$$j + 2k + 2l = 1$$
$$2k + 8l = \tfrac{1}{12}$$
$$2k + 32l = \tfrac{1}{80}$$

The solution of these equations gives

$$j = \frac{5178}{5760},\quad k = \frac{308}{5760} \quad\text{and}\quad l = -\frac{17}{5760},$$

and we obtain

$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{5760}\{5178y_0 + 308(y_{-1} + y_1) - 17(y_{-2} + y_2)\} \qquad \text{(I)}$$

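II. If

$$y_x = a + bx + cx^2 + dx^3$$
$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{24}\{y_{-1} + 22y_0 + y_1\} \qquad \text{(II)}$$

III. If

$$y_x = a + bx + cx^2 + dx^3 + ex^4$$
$$\int_{-1/2}^{1/2} y_x\,dx = \frac{1}{1440}\{802(y_{1/2} + y_{-1/2}) - 93(y_{1\frac12} + y_{-1\frac12}) + 11(y_{2\frac12} + y_{-2\frac12})\} \qquad \text{(III)}$$

IV. If

$$y_x = a + bx + cx^2 + dx^3$$
$$\int_{-1/2}^{1\frac12} y_x\,dx = \frac{1}{24}\{27y_0 + 17y_1 + 5y_2 - y_3\} \qquad \text{(IV)}$$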
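The coefficients of these quadrature formulae can be checked with exact fractions. The sketch below is illustrative only: it solves the three equations for j, k and l and then confirms that formulae (I), (II) and (IV) integrate the relevant powers of x exactly. It assumes that (IV) covers the base from -1/2 to 1 1/2, which is the range on which its coefficients are exact; the helper names are hypothetical.

```python
from fractions import Fraction as F

# Solve j + 2k + 2l = 1, 2k + 8l = 1/12, 2k + 32l = 1/80 by elimination.
ell = (F(1, 80) - F(1, 12)) / 24
k = (F(1, 12) - 8 * ell) / 2
j = 1 - 2 * k - 2 * ell


def exact_integral(power, lower, upper):
    """Integral of x**power between the given limits."""
    return (upper ** (power + 1) - lower ** (power + 1)) / (power + 1)


def apply_rule(weights, nodes, power):
    """Weighted sum of x**power over the ordinates used by a formula."""
    return sum(w * F(x) ** power for w, x in zip(weights, nodes))


half = F(1, 2)
rule_I = ([ell, k, j, k, ell], [-2, -1, 0, 1, 2])
rule_II = ([F(1, 24), F(22, 24), F(1, 24)], [-1, 0, 1])
rule_IV = ([F(27, 24), F(17, 24), F(5, 24), F(-1, 24)], [0, 1, 2, 3])

ok_I = all(apply_rule(*rule_I, p) == exact_integral(p, -half, half) for p in range(6))
ok_II = all(apply_rule(*rule_II, p) == exact_integral(p, -half, half) for p in range(4))
ok_IV = all(apply_rule(*rule_IV, p) == exact_integral(p, -half, 3 * half) for p in range(4))

print(j, k, ell, ok_I, ok_II, ok_IV)
```

Because the formulae are symmetrical about the middle ordinate, (I) is in fact exact for a fifth-degree term as well, and (II) for a cubic.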

18. We can now take the calculation of the moments, where
$\int_{-1/2}^{n-1/2} y_x\,dx$ is required in terms of $y_0, y_1, \dots, y_{n-1}$.

Now

$$\int_{-1/2}^{n-1/2} y_x\,dx = \int_{-1/2}^{1/2} y_x\,dx + \int_{1/2}^{1\frac12} y_x\,dx + \dots + \int_{n-1\frac12}^{n-\frac12} y_x\,dx.$$

If formula (I) be applied it can be used for all the integrals
on the right-hand side of this equation except the first two
and the last two, and the values of these are given by (IV).
Summing the values obtained and writing (IV) with the de-
nominator 5760, we obtain

$$\int_{-1/2}^{n-1/2} y_x\,dx = \frac{1}{5760}\{6463y_0 + 4371y_1 + 6669y_2 + 5537y_3
   + 5760(y_4 + y_5 + \dots + y_{n-5})$$
$$\qquad\qquad + 5537y_{n-4} + 6669y_{n-3} + 4371y_{n-2} + 6463y_{n-1}\} \qquad \text{(V)}$$

which means that we can multiply

the first and last ordinates by 6463/5760 (= 1.1220486),
the second and last but one by 4371/5760 (= .7588542),
the third and last but two by 6669/5760 (= 1.1578125),
the fourth and last but three by 5537/5760 (= .9612847),

leave all the other ordinates unaltered, and work out the
moments in the usual way from this modified series of
ordinates. If there are less than eight ordinates, another
formula must be evolved.
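Formula (V) amounts to weighting the four ordinates at each end of the series. A minimal sketch follows, applied to the nine ordinates that head Table V; the function name `modify_ordinates` is illustrative and not from the text.

```python
# End weights of formula (V), applied symmetrically to the first and last four ordinates.
END_WEIGHTS = [6463 / 5760, 4371 / 5760, 6669 / 5760, 5537 / 5760]


def modify_ordinates(ordinates):
    """Return the series of ordinates modified by formula (V)."""
    if len(ordinates) < 8:
        raise ValueError("formula (V) needs at least eight ordinates")
    weights = [1.0] * len(ordinates)
    for i, w in enumerate(END_WEIGHTS):
        weights[i] = w            # first, second, third, fourth
        weights[-1 - i] = w       # last, last but one, ...
    return [w * y for w, y in zip(weights, ordinates)]


y = [51.81, 43.74, 35.42, 27.80, 20.42, 13.79, 8.22, 4.29, 1.69]
modified = modify_ordinates(y)
total_area = sum(modified)

print([round(v, 2) for v in modified])
print(round(total_area, 2))
```

With exactly eight ordinates every term is weighted; with more than eight, the middle terms are left as they stand.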

19. In the following table the original series and the modi-
fied one are set out in the first two columns, and in the other
columns the calculations of the first four moments about the
middle of the range by the direct method are shown:


Table V

   y_x    Modified by     x     y'_x × x   y'_x × x²   y'_x × x³    y'_x × x⁴
          formula (V)
             y'_x

  51.81      58.13       -4      232.52      930.08    3,720.32    14,881.28
  43.74      33.19       -3       99.57      288.71      866.13     2,598.39
  35.42      41.01       -2       82.02      164.04      328.08       656.16
  27.80      26.72       -1       26.72       26.72       26.72        26.72
  20.42      20.42        0     -440.83               -4,941.25
  13.79      13.26       +1       13.26       13.26       13.26        13.26
   8.22       9.52       +2       19.04       38.08       76.16       152.32
   4.29       3.26       +3        9.78       29.34       88.02       264.06
   1.69       1.90       +4        7.60       30.40      121.60       486.40

 207.18     207.41               +49.68    1,520.63     +299.04    19,078.59
                                -391.15               -4,642.21

207.41 is then treated as the total frequency, and the
moments for unit frequency ($\nu'_n$) would be obtained by dividing
-391.15, 1520.63, etc. by 207.41, and not by 207.18, which is
not the "total frequency", but merely gives the uncorrected
sum of certain equidistant values.

20. The work can sometimes be simplified considerably,
for if the values at the ends of the experience are very small and
have a tendency to keep close to the axis of x before they
finally vanish (i.e. if there is high contact; most actuarial
functions $l_x$, $a_x$, $D_x$, etc. have high contact at the old age end
of the table), then it is reasonable to suppose that ordinates
before the first and after the last exist, but are insignificant
in value. Thus the integral corresponding to the whole series
of ordinates can be legitimately extended beyond the limits
-1/2 and n - 1/2 previously used, because the additional area
thus introduced will be evanescent. Now if the area be so
extended, the effect will be that in equation (V) the significant
ordinates from $y_0$ to $y_{n-1}$ will all have the coefficient unity,
and the ordinates with weighted coefficients will all vanish.

The practical result is, that if there is high contact at one
end of the statistics the adjustment need only be made at the
other end, while if there is high contact at both ends no adjust-
ment is necessary.

Mathematically, high contact means that the first few
differential coefficients vanish at the point of contact. The
diagrams on pp. 71 and 83 show high contact at both ends of
the curves, and the diagram on p. 63 shows high contact at
the longer durations.

21. The second case in § 16, namely, that in which mid-
ordinates are used instead of areas, may now be examined.
By concentrating areas about the middle points of their
bases, we assume that the distances by which the areas
$\int_{-1/2}^{1/2} y_x\,dx$, $\int_{1/2}^{1\frac12} y_x\,dx$, etc. must be multiplied are the same as
the distances from $y_0$, $y_1$, etc.; that is, the tth moment from the
statistics is

$$\int_{-1/2}^{1/2} y_x\,dx\,X^t + \int_{1/2}^{1\frac12} y_x\,dx\,(X+1)^t + \dots
  + \int_{n-1\frac12}^{n-\frac12} y_x\,dx\,(X+n-1)^t$$

and we require $\int_{-1/2}^{n-1/2} (X+x)^t y_x\,dx$, where X is the distance of
$y_0$ from the ordinate about which moments are calculated.

Applying formula (I) to each integral and collecting terms,
we reach as a general coefficient

$$\frac{1}{5760}\{\dots + [5178h^t + 308\{(h-1)^t + (h+1)^t\}
   - 17\{(h-2)^t + (h+2)^t\}]\,y + \dots\}$$

where h is written for X + x for simplification, or

$$\frac{1}{5760}\{5760h^t + 240\,t(t-1)h^{t-2} + 3\,t(t-1)(t-2)(t-3)h^{t-4} + \dots\}$$

If t = 1 this becomes $h$
 „ t = 2      „       $h^2 + \tfrac{1}{12}$
 „ t = 3      „       $h^3 + \tfrac{1}{4}h$
 „ t = 4      „       $h^4 + \tfrac{1}{2}h^2 + \tfrac{1}{80}$

It has already been noticed that if there is high contact, the
value of $\int (X+x)^t y\,dx$ is found by using the unadjusted
ordinates, that is, the second moment is given by a series,
the general term of which is $h^2y$; the third by a series, the
general term of which is $h^3y$, and so on; hence, if $\mu$ be written
for the true adjusted moment about the mean and $\nu$ for the
unadjusted moment, the relations between $\mu$ and $\nu$ are given by

$$\mu_2 + \tfrac{1}{12} = \nu_2 \quad\text{or}\quad \mu_2 = \nu_2 - \tfrac{1}{12}$$
$$\mu_4 + \tfrac{1}{2}\mu_2 + \tfrac{1}{80} = \nu_4 \quad\text{or}\quad \mu_4 = \nu_4 - \tfrac{1}{2}\nu_2 + \tfrac{7}{240}$$

The mean needs no adjustment, for if t = 1 the general term
has the correct coefficient h, and the third moment has to be
adjusted by 1/4 of the first moment, which is zero where the
moments are taken about the mean.* In order to demonstrate
the correction for the nth moment by the above method,
a parabola of at least the nth order is necessary. If we apply
these adjustments to the moments found on p. 10, for Example
IV of Table I, we have $\mu_2 = 3.1503$, $\mu_3 = -1.430976$ and
$\mu_4 = 28.82866$. These adjustments are found to make an appre-
ciable difference in the constants obtained from the moments,
especially when there is a small number of terms.

In other words, they allow for the grouping, and the lesson
to be learnt is that a moderate amount of grouping saves work
and, thanks to our knowledge of the correct adjustments, does
not introduce error in the circumstances described.

* These adjustments were first given by W. F. Sheppard in Proc. Lond. Math.
Soc. xxix, 353-80. See also Appendix I.
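The adjustments for grouping can be applied mechanically once the rough moments about the mean are known. The following is a sketch under the conditions stated in the text (high contact at both ends, unit of grouping taken as unity); the function name `adjust_for_grouping` is hypothetical.

```python
def adjust_for_grouping(v2, v3, v4):
    """Apply the adjustments of § 21 to rough moments about the mean.

    The moments are assumed to be expressed in units of the grouping interval.
    """
    mu2 = v2 - 1 / 12
    mu3 = v3                      # no adjustment: the first moment about the mean is zero
    mu4 = v4 - v2 / 2 + 7 / 240
    return mu2, mu3, mu4


# Rough moments for Example IV of Table I, as found earlier in the chapter.
mu2, mu3, mu4 = adjust_for_grouping(3.2336, -1.430976, 30.416289)
standard_deviation = mu2 ** 0.5   # still in units of five years

print(round(mu2, 4), mu3, round(mu4, 5))
print(round(standard_deviation, 4))
```

The standard deviation follows as the square root of the adjusted second moment, and must be multiplied by the unit of grouping to express it in years.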

22. The practical conclusions in the two preceding para-
graphs as to the treatment of moments when there is high
contact can be checked numerically. The equation to a curve
with high contact at each end having been written down, we
can work out the ordinates at equidistant points or the areas
on equal bases and calculate the moments from the figures.
From the equation to the curve we can also calculate the area
and moments for the whole curve and it will be found that the
corresponding figures agree. A good curve with which to make
experiments in this way is "the normal curve of error"
because the ordinates and areas are accurately tabulated in
Tables for Statisticians, but anyone wishing to apply this sort
of check is advised to wait until he has read a little about
frequency-curves.

When there is not high contact at both ends of the curve,
the adjustments become more difficult to value; suggestions
have been made for finding the corrections, and this matter
is further discussed in Appendix I, but a beginner is advised
to avoid these refinements.

A student should calculate the moments for one or two
distributions, and make the easier adjustments; he can also
find the standard deviations of distributions, for the
S.D. = $\sqrt{\mu_2}$, where the $\mu_2$ has been adjusted in accordance
with the above rules. In Examples III and IV of Table I there
is clearly high contact, and in Example I the rough moment
should be used. In Examples II and V there is more doubt, and
in the calculation of the moments for Example II (see p. 60),
no adjustment was made.

This advice is given not because adjustment is unnecessary,
but because a beginner can content himself with mastering
the general idea and leave out some of the refinements until
he has a little more experience. Later on, when the methods of
Appendix I are examined, it will be seen that Sheppard's
adjustments alone do not usually improve the rough moments
when the distribution is abrupt.

23. Before proceeding to deal with fitting more complicated
curves it is advisable to consider the application of the method
of moments to a simple case, namely, when

$$y = a + bx + cx^2 + \dots$$

Let the range be 2l, and let the origin be at the middle point
of the range, and $m_0$ stand for the area and $m_n$ for the nth
moment of the whole distribution about the middle of the
range. Then

$$m_{2n} = \int_{-l}^{l} (a + bx + cx^2 + \dots)\,x^{2n}\,dx
        = 2l^{2n+1}\left(\frac{a}{2n+1} + \frac{cl^2}{2n+3} + \dots\right)$$

and similarly

$$m_{2n+1} = 2l^{2n+2}\left(\frac{bl}{2n+3} + \frac{dl^3}{2n+5} + \dots\right)$$

These equations show that the even moments give the
constants a, c, e, etc., and the odd moments give the constants
b, d, f, etc. This is, of course, the result of using moments about
the middle of the range, and makes the solution of the equations
less laborious than they would otherwise have been. The
solution can also be simplified a little by writing

$$\frac{1}{2l}\,\frac{m_{2n}}{l^{2n}} = \frac{a}{2n+1} + \frac{cl^2}{2n+3} + \dots$$

so that

$$\frac{1}{2l}\,m_0 = a + \frac{cl^2}{3} + \frac{el^4}{5} + \dots$$
$$\frac{1}{2l}\,\frac{m_2}{l^2} = \frac{a}{3} + \frac{cl^2}{5} + \frac{el^4}{7} + \dots$$
$$\frac{1}{2l}\,\frac{m_4}{l^4} = \frac{a}{5} + \frac{cl^2}{7} + \frac{el^4}{9} + \dots$$

and similarly

$$\frac{1}{2l}\,\frac{m_1}{l} = \frac{bl}{3} + \frac{dl^3}{5} + \frac{fl^5}{7} + \dots$$
$$\frac{1}{2l}\,\frac{m_3}{l^3} = \frac{bl}{5} + \frac{dl^3}{7} + \frac{fl^5}{9} + \dots$$
$$\frac{1}{2l}\,\frac{m_5}{l^5} = \frac{bl}{7} + \frac{dl^3}{9} + \frac{fl^5}{11} + \dots$$

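The solution of these equations gives the constants required;
for example,

(i) if $y = a + bx$, we have

$$a = \frac{1}{2l}\,m_0$$
$$b = \frac{3}{l}\cdot\frac{1}{2l}\,\frac{m_1}{l}$$

(ii) if $y = a + bx + cx^2$

$$a = \frac{3}{4}\left(\frac{3}{2l}\,m_0 - \frac{5}{2l}\,\frac{m_2}{l^2}\right)$$
$$b = \frac{3}{l}\cdot\frac{1}{2l}\,\frac{m_1}{l}$$
$$c = \frac{15}{4l^2}\left(-\frac{1}{2l}\,m_0 + \frac{3}{2l}\,\frac{m_2}{l^2}\right)$$

(iii) if $y = a + bx + cx^2 + dx^3$

$$a = \frac{3}{4}\left(\frac{3}{2l}\,m_0 - \frac{5}{2l}\,\frac{m_2}{l^2}\right)$$
$$b = \frac{15}{4l}\left(\frac{5}{2l}\,\frac{m_1}{l} - \frac{7}{2l}\,\frac{m_3}{l^3}\right)$$
$$c = \frac{15}{4l^2}\left(-\frac{1}{2l}\,m_0 + \frac{3}{2l}\,\frac{m_2}{l^2}\right)$$
$$d = \frac{35}{4l^3}\left(-\frac{3}{2l}\,\frac{m_1}{l} + \frac{5}{2l}\,\frac{m_3}{l^3}\right)$$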

The above results, which can easily be extended if it is
wished, may now be applied to one or two numerical examples.

24. As a first example, we shall graduate the statistics in
Table V, § 19, for which the moments about the middle of the
range have been calculated. Taking the curve $y = a + bx + cx^2$,
the following values from Table V will be required:

    2l = 9
    $m_0$ = 207.41
    $m_1$ = -391.15
    $m_2$ = 1520.63

Hence

$$a = \frac{3}{4}\left(\frac{622.23}{9} - \frac{5}{9}\times\frac{1520.63}{(4.5)^2}\right) = 20.563$$

$$b = \frac{3}{4.5}\times\frac{1}{9}\times\frac{-391.15}{4.5} = -6.4387$$

$$c = \frac{15}{4(4.5)^2}\left(-\frac{207.41}{9} + \frac{3}{9}\times\frac{1520.63}{(4.5)^2}\right) = .36815$$

25. The best way to obtain the ordinates corresponding
to this graduation is by calculating b + c, the first difference,
and 2c, the second difference, from the middle term; their
values are -6.0706 and .7363 respectively. Since second
differences are constant, the work is done continuously, and
is as follows:

    Graduated
    ordinate         Δ

     52.208
                  -9.016
     43.192
                  -8.279
     34.913
                  -7.543
     27.370
                  -6.807
     20.563
                  -6.071
     14.492
                  -5.335
      9.157
                  -4.599
      4.558
                  -3.862
       .696

These graduated figures will be found to agree fairly well
with those given in the first column of Table V.
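Working by constant second differences is easily mechanised. The sketch below builds the graduated ordinates outwards from the middle term using only additions; the function name `graduate_quadratic` is illustrative, and the constants are those found in § 24.

```python
def graduate_quadratic(a, b, c, terms_each_side):
    """Ordinates of y = a + b*x + c*x**2 for x = -k..k, built by differences."""
    forward, backward = [a], [a]
    step_forward = b + c      # y(1) - y(0)
    step_backward = c - b     # y(-1) - y(0)
    for _ in range(terms_each_side):
        forward.append(forward[-1] + step_forward)
        backward.append(backward[-1] + step_backward)
        step_forward += 2 * c     # the second difference is constant
        step_backward += 2 * c
    # backward holds y(0), y(-1), ..., y(-k); reverse it and drop the duplicate y(0).
    return backward[:0:-1] + forward


a, b, c = 20.563, -6.4387, 0.36815
graduated = graduate_quadratic(a, b, c, 4)
print([round(v, 3) for v in graduated])
```

The end values differ from the printed column by a unit or two in the third decimal place, because the printed differences were rounded before being accumulated.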

26. As a further example the following statistics, taken from
a paper by S. H. J. W. Allin (J. Inst. Actu. xxxix, 350), and
giving the values of annuities to widows in pension funds
according to the age of the member, may be considered:

        Value     Modified     Distance from
Age      of          by        middle of range     a' × d      a' × d²       a' × d³
       annuity   formula (V)   multiplied by 2
                   p. 28
                    a'               d

 27     21.20      23.79            -7             166.53     1165.71       8159.97
 32     19.91      15.11            -5              75.55      377.75       1888.75
 37     19.34      22.40            -3              67.20      201.60        604.80
 42     18.58      17.86            -1              17.86       17.86         17.86
                                                  -327.14                 -10671.38
 47     16.74      16.09            +1              16.09       16.09         16.09
 52     15.69      18.17            +3              54.51      163.53        490.59
 57     14.70      11.15            +5              55.75      278.75       1393.75
 62     12.99      14.58            +7             102.06      714.42       5000.94

                  139.15                          +228.41     2935.71      +6901.37
                                                   -98.73                  -3770.01

In calculating the above moments it has been assumed that
the figures to be graduated represent a system of ordinates;
if they had represented a system of areas, the adjustment by
formula (V) would have been unsuitable.

The alternative is to avoid the integral calculus and work
out from the equation $y = f(x)$ the sum of the ordinates and
the moments of the ordinates. In the particular case where
$f(x) = a + bx + cx^2 + \dots$ this is practicable, but there are many
expressions which, with their moments, can be integrated but
do not lend themselves to finite summation. We have therefore
confined attention to the more general method.

When there is an even number of terms the difficulty of
calculating the moments about the middle of the range is that
the terms have to be multiplied by .5, 1.5, 2.5, etc., and if the
series to be graduated contains only a few terms, it is best to
deal with the distance d, in the way shown above, and then
divide the totals by 2, 4 and 8, in order to obtain the first,
second and third moments respectively. In this way, we have

$l = 4$

$m_0 = 139.15$

$m_1 = -49.36$

$m_2 = 733.93$

$m_3 = -471.25$
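
The arithmetic behind these figures can be sketched as follows. The function name is hypothetical and the age interval of five years is read off the table; the values a' are those tabulated above. Each total is formed with the doubled distance d and then divided by the matching power of 2.

```python
# (age, a') pairs: values already modified by formula (V), as tabulated above
modified = [(27, 23.79), (32, 15.11), (37, 22.40), (42, 17.86),
            (47, 16.09), (52, 18.17), (57, 11.15), (62, 14.58)]


def moments_via_doubled_distance(rows, interval=5):
    """Return m0..m3 about the middle of the range, in units of the age interval."""
    ages = [age for age, _ in rows]
    middle = (ages[0] + ages[-1]) / 2
    totals = [0.0, 0.0, 0.0, 0.0]
    for age, value in rows:
        d = round(2 * (age - middle) / interval)  # odd integers -7, -5, ..., +7
        for power in range(4):
            totals[power] += value * d ** power
    # undo the doubling: the k-th total is divided by 2**k
    return [total / 2 ** power for power, total in enumerate(totals)]


m0, m1, m2, m3 = moments_via_doubled_distance(modified)
print(m0, m1, m2, m3)
```

Working with d keeps every multiplier a whole number, which is the convenience the text has in view when the number of terms is even.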

We will now fit the statistics with each of the three curves,
the formulae for which have been given, and compare the
resulting graduations:

(i) $y = 17.394 - 1.157x$
(ii) $y = 17.633 - 1.157x - .0451x^2$
(iii) $y = 17.633 - 1.190x - .0451x^2 + .0035x^3$

The following table shows the graduations:

Age   Ungraduated     (i)      (ii)     (iii)

27       21.20       21.44    21.13    21.13
32       19.91       20.29    20.24    20.28
37       19.34       19.13    19.27    19.31
42       18.58       17.97    18.20    18.22
47       16.74       16.82    17.01    17.02
52       15.69       15.66    15.80    15.76
57       14.70       14.50    14.46    14.43
62       12.99       13.34    13.03    13.05


Formulae (ii) and (iii) are practically identical, and both are
considerably closer to the original figures than (i).
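
The constants of the three curves can be recovered from the moments along the following lines. This is a sketch under stated assumptions, not the book's own working: the moments of each polynomial are taken as integrals over the range -l to +l, x is measured from the middle of the range (age 44.5) in units of five years, and the helper names are hypothetical.

```python
# Moments about the middle of the range (unit = one age interval), from the text
l = 4.0
m0, m1, m2, m3 = 139.15, -49.36, 733.93, -471.25


def power_integral(k):
    """Integral of x**k over (-l, l); zero when k is odd."""
    return 0.0 if k % 2 else 2 * l ** (k + 1) / (k + 1)


def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Cramer's rule for a pair of simultaneous equations."""
    det = a11 * a22 - a12 * a21
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det


I0, I2, I4, I6 = (power_integral(k) for k in (0, 2, 4, 6))

# (i) y = a + bx: the even and odd moment equations separate
line = (m0 / I0, m1 / I2)
# (ii) y = a + bx + cx^2: a and c come from m0 and m2, b is unchanged
a2, c2 = solve_2x2(I0, I2, I2, I4, m0, m2)
parabola = (a2, m1 / I2, c2)
# (iii) y = a + bx + cx^2 + dx^3: b and d come from m1 and m3
b3, d3 = solve_2x2(I2, I4, I4, I6, m1, m3)
cubic = (a2, b3, c2, d3)


def graduate(coefficients, age, middle=44.5, interval=5):
    """Graduated value at a given age for a polynomial in ascending powers."""
    x = (age - middle) / interval
    return sum(c * x ** k for k, c in enumerate(coefficients))


for age in range(27, 63, 5):
    print(age, *(round(graduate(f, age), 2) for f in (line, parabola, cubic)))
```

Because the range is symmetrical about the origin, the odd and even powers never mix, so no system larger than two equations has to be solved.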
27. The results obtained so far may be summarised as
follows:

(1) The method of moments is a general method of finding
the constants in a formula suitable to a particular
statistical example, and it consists of equating the values
of $\Sigma f(n) \times n^t$ (which is called the $t$th moment, and is
summed for all values of n that occur) to similar
expressions obtained from the graduation formula. These
latter expressions will be algebraic, and simultaneous
equations have to be solved in order to find the
arithmetical constants.

(2) The moments from the statistics can be calculated by
multiplying the frequencies by appropriate values of $n^t$
or by the summation method.

(3) If moments have been obtained about any one vertical,
they can be transferred to any other by the formulae
in § 6 of this chapter.

(4) Since the moments from the graduation formula must
generally be found by means of the integral calculus,
while those from the statistics are found by summation,
the latter have to be adjusted before the equations for
obtaining the constants can be correctly formed. The
adjustments depend on whether the statistics are a
system of ordinates or a system of areas; in the former
case adjustment is made by equation (V), and in the
latter by the formulae in § 21 if there is high contact at
both ends of the curve.



CHAPTER IV


PEARSON'S SYSTEM OF
FREQUENCY-CURVES

1. When it becomes necessary in practical work to decide
on a system of curves for describing frequency distributions,
we have to bear in mind that

(1) Any expression used must be a graduation formula;
it must remove the roughness of the material.

(2) There must not be so many constants in the formula
that we require a great number of moments, for this
means that the accuracy is reduced. The higher the
moment the more liable it is to error when deduced
from ungraduated observations; this is clear when we
remember that the ends of the experiences are multiplied
by the highest numbers and their powers.

(3) There must be a systematic method of approaching
frequency distributions.

2. Now, considering the more obvious characteristics of
frequency distributions, we find they generally start at zero,
rise to a maximum, and then fall sometimes at the same but
often at a different rate. At the ends of the distribution there
is often high contact. This means, mathematically, that a series
of equations $y = f(x)$, $y = \phi(x)$, etc. must be chosen, so that
in each equation of the series $dy/dx = 0$ for certain values of
x, namely, at the maximum and at the end of the curve
where there is contact with the axis of x.

The above suggests that $dy/dx$ may be put equal to
$\dfrac{y(x+a)}{F(x)}$; then, if $y = 0$, $dy/dx = 0$, and there is, therefore,
contact at one end of the curve, while if $x = -a$, $dy/dx = 0$, and we
have the maximum we require. So long as $F(x)$ is general the
form assumed for $dy/dx$ is extremely general and includes
cases when $dy/dx$ may not be zero when y is zero. If $F(x)$ is
expanded by Maclaurin's theorem in ascending powers of x,
we have

$$\frac{dy}{dx} = \frac{y(x+a)}{b_0 + b_1 x + b_2 x^2 + \dots} \qquad \text{(I)}$$


We shall return to this equation and show how it can be put in
the form $y = f(x)$, so as to express y as a direct function of x,
and we shall see that we have obtained something more general
than is implied at the beginning of this paragraph. We shall
obtain curves taking various widely different shapes. As the
matter has up to the present been approached from an
experimental point of view, it will be interesting to see how equation
(I) can be obtained up to the $x^2$ term in the denominator from
elementary propositions in the theory of probabilities.

3. If p be the probability of an event happening and q the
probability of its failing, then the probabilities of its happening
once, twice, and so on out of n trials are given by the terms of
the expansion $(p+q)^n$, or if we have N cases, the terms of
$N(p+q)^n$ give the frequency distribution of the N cases into
$n + 1$ groups. The binomial series does not represent nearly all
the probabilities that arise, and another series that occurs is the
hypergeometrical. Thus the chances of getting $r, r-1, \dots, 0$
black balls from a bag containing pn black and qn white balls
when r balls are drawn, are given by the successive terms of the
series

$$\frac{pn(pn-1)\cdots(pn-r+1)}{n(n-1)\cdots(n-r+1)}\left[1 + r\,\frac{qn}{pn-r+1} + \frac{r(r-1)}{1\cdot 2}\,\frac{qn(qn-1)}{(pn-r+1)(pn-r+2)} + \dots\right]$$

A numerical example may help to make clear the way the
series arises. A bag contains seven balls, of which four are black
and three white; then if three balls are drawn the probability
that

All will be black is $\dfrac{4\cdot 3\cdot 2}{7\cdot 6\cdot 5}$

Two will be black is $\dfrac{4\cdot 3\cdot 2}{7\cdot 6\cdot 5}\times\dfrac{3\cdot 3}{2}$

One will be black is $\dfrac{4\cdot 3\cdot 2}{7\cdot 6\cdot 5}\times\dfrac{3\cdot 2}{1\cdot 2}\times\dfrac{3\cdot 2}{2\cdot 3}$

None will be black is $\dfrac{4\cdot 3\cdot 2}{7\cdot 6\cdot 5}\times\dfrac{3\cdot 2\cdot 1}{1\cdot 2\cdot 3}\times\dfrac{3\cdot 2\cdot 1}{2\cdot 3\cdot 4}$

The sum of these four expressions is unity. The terms can be
seen to agree with the series by putting $n = 7$, $pn = 4$, $qn = 3$,
and $r = 3$.
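
The bag example can be checked with exact fractions. The sketch below counts combinations directly instead of building the series term by term, so it is an independent check and not the author's working; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def black_ball_probabilities(black, white, drawn):
    """Exact chances of drawing drawn, drawn-1, ..., 0 black balls without replacement."""
    total_ways = comb(black + white, drawn)
    return [Fraction(comb(black, k) * comb(white, drawn - k), total_ways)
            for k in range(drawn, -1, -1)]


chances = black_ball_probabilities(black=4, white=3, drawn=3)
for blacks, chance in zip(range(3, -1, -1), chances):
    print(blacks, "black:", chance)
print("sum:", sum(chances))
```

The four chances come out as 4/35, 18/35, 12/35 and 1/35, in agreement with the four expressions above once each product is multiplied out.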

Other series may arise, but those given will be sufficient for
the present purpose, and we shall proceed to consider how they
can be put in the form of equation (I). The inconvenience of the
expressions as they now stand becomes fairly obvious when an
attempt is made to calculate numerical values for a large
number of groups, and besides this, they are not continuous, while
the statistics of practical work often are.

Considering the hypergeometrical series, and remembering
that the function required for equation (I) is $\dfrac{1}{y}\dfrac{dy}{dx}$ and that,
as the series is discontinuous, finite differences must be used,
we have

$$y_x = \frac{pn(pn-1)\cdots(pn-r+1)}{n(n-1)\cdots(n-r+1)} \times \frac{r(r-1)\cdots(r-x+2)}{(x-1)!} \times \frac{qn(qn-1)\cdots(qn-x+2)}{(pn-r+1)(pn-r+2)\cdots(pn-r+x-1)}$$

$$\frac{\Delta y_x}{y_x} = \frac{y_{x+1}-y_x}{y_x} = \frac{r-x+1}{x}\cdot\frac{qn-x+1}{pn-r+x} - 1 = \frac{(r+1)(qn+1) - x(n+2)}{x(pn-r+x)} \quad \text{for } p+q = 1$$

and

$$y_{x+\frac12} = \tfrac12\,(y_{x+1} + y_x) = \tfrac12\, y_x \left\{\frac{(r+1)(qn+1) - x[2(r+1) + n(q-p)] + 2x^2}{x(pn-r+x)}\right\}$$

$$\frac{\Delta y_x}{y_{x+\frac12}} = \frac{2\{(r+1)(qn+1) - x(n+2)\}}{(r+1)(qn+1) - x\{2(r+1) + n(q-p)\} + 2x^2}$$

which may be put in the form of equation (I),

$$\frac{1}{y}\frac{dy}{dx} = \frac{a + x}{b_0 + b_1 x + b_2 x^2}$$

In this form the actuarial reader will naturally think of the
force of mortality: to proceed from the force of mortality, after
changing its sign, to the "number living" ($l_x$) in a life table is
the same thing as to proceed from the formula just given to
a frequency-curve.
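
The finite-difference ratio obtained above can be verified on the bag example of seven balls. The sketch assumes that the term numbered x corresponds to x - 1 white balls among the r drawn, which is how the series is ordered in the text; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def term(x, n, pn, r):
    """y_x: chance of x - 1 white balls (r - x + 1 black) among r drawn."""
    qn = n - pn
    return Fraction(comb(pn, r - x + 1) * comb(qn, x - 1), comb(n, r))


def slope_ratio(x, n, pn, r):
    """Closed form for (y_{x+1} - y_x) divided by the mean of y_{x+1} and y_x."""
    qn = n - pn
    numerator = 2 * ((r + 1) * (qn + 1) - x * (n + 2))
    # n(q - p) is written as qn - pn
    denominator = (r + 1) * (qn + 1) - x * (2 * (r + 1) + (qn - pn)) + 2 * x * x
    return Fraction(numerator, denominator)


n, pn, r = 7, 4, 3
for x in range(1, r + 1):
    y0, y1 = term(x, n, pn, r), term(x + 1, n, pn, r)
    direct = (y1 - y0) / ((y1 + y0) / 2)
    print(x, direct, slope_ratio(x, n, pn, r))
```

The numerator is linear in x and the denominator quadratic, which is exactly the shape of the right-hand side of equation (I) taken as far as the $x^2$ term.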

4. Returning to equation (I), we see that it can be written
in the form

$$(b_0 + b_1 x + b_2 x^2 + \dots)\frac{dy}{dx} = y(x + a);$$

multiplying each side by $x^n$, and integrating with respect to x,
we have

$$\int (b_0 + b_1 x + b_2 x^2 + \dots)\,x^n\,\frac{dy}{dx}\,dx = \int y(x + a)\,x^n\,dx.$$

Integrate the left-hand side by parts treating $dy/dx$ as one
part, and the right-hand side as the sum of two functions, and
then

$$x^n(b_0 + b_1 x + b_2 x^2 + \dots)\,y - \int \{n b_0 x^{n-1} + (n+1) b_1 x^n + (n+2) b_2 x^{n+1} + \dots\}\,y\,dx = \int y x^{n+1}\,dx + \int y a x^n\,dx$$

or, if at the ends of the range of the curve the expression
$x^n(b_0 + b_1 x + b_2 x^2 + \dots)\,y$ vanishes, we have

$$-n b_0 \mu'_{n-1} - (n+1) b_1 \mu'_n - (n+2) b_2 \mu'_{n+1} - \dots = \mu'_{n+1} + a\mu'_n$$

where we use the notation we have already adopted, namely,

$$\mu'_n = \int y x^n\,dx.$$

If we put $n = 0, 1, 2, \dots, s$ respectively, we get $s + 1$ equations
to enable us to find $a$, $b_0$, $b_1$, etc., in terms of the moments
($\mu'$) as shown by the following equations, which have been
obtained by writing the equation in the form

$$a\mu'_n + n b_0 \mu'_{n-1} + (n+1) b_1 \mu'_n + (n+2) b_2 \mu'_{n+1} + \dots = -\mu'_{n+1}$$

and then putting $n = 0, 1, 2$, etc.

$$
\begin{aligned}
a\mu'_0 + 0 \times b_0 + b_1\mu'_0 + 2b_2\mu'_1 + \dots &= -\mu'_1\\
a\mu'_1 + b_0\mu'_0 + 2b_1\mu'_1 + 3b_2\mu'_2 + \dots &= -\mu'_2\\
a\mu'_2 + 2b_0\mu'_1 + 3b_1\mu'_2 + 4b_2\mu'_3 + \dots &= -\mu'_3\\
a\mu'_3 + 3b_0\mu'_2 + 4b_1\mu'_3 + 5b_2\mu'_4 + \dots &= -\mu'_4
\end{aligned}
\qquad \text{(II)}
$$

etc., etc.


Let us now make $\mu'_1 = 0$, and alter the other moments in
the way indicated in Chapter III, for the result of making
$\mu'_1 = 0$ is to change the origin of the system to the mean of the
distribution. We can also treat $\mu'_0$ as 1, and these
simplifications lead to the following results:

(1) Keeping $b_0$ only, we have

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x}{\mu_2}$$

(2) Keeping $b_0$ and $b_1$, the first three equations in the
system (II) above give

$$a + b_1 = 0$$

$$b_0 = -\mu_2$$

and $a\mu_2 + 3b_1\mu_2 = -\mu_3$

and $a = -b_1 = \dfrac{\mu_3}{2\mu_2}$

and the differential equation becomes

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\mu_3}{2\mu_2}}{\mu_2 + \dfrac{\mu_3}{2\mu_2}\,x}$$

(3) Keeping $b_0$, $b_1$, $b_2$, the system gives

$$
\begin{aligned}
a + b_1 &= 0\\
b_0 + 3b_2\mu_2 &= -\mu_2\\
a\mu_2 + 3b_1\mu_2 + 4b_2\mu_3 &= -\mu_3\\
a\mu_3 + 3b_0\mu_2 + 4b_1\mu_3 + 5b_2\mu_4 &= -\mu_4
\end{aligned}
$$

The solution of these simultaneous equations is perfectly
straightforward, and leads to

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}}{\dfrac{\mu_2(4\mu_2\mu_4 - 3\mu_3^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2} + \dfrac{\mu_3(\mu_4 + 3\mu_2^2)}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}\,x + \dfrac{2\mu_2\mu_4 - 3\mu_3^2 - 6\mu_2^3}{10\mu_2\mu_4 - 18\mu_2^3 - 12\mu_3^2}\,x^2}$$

In this last form put $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$ and $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$ and

$$\frac{1}{y}\frac{dy}{dx} = -\frac{x + \dfrac{\sqrt{\mu_2}\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}}{\dfrac{\mu_2(4\beta_2 - 3\beta_1) + \sqrt{\mu_2}\sqrt{\beta_1}\,(\beta_2 + 3)\,x + (2\beta_2 - 3\beta_1 - 6)\,x^2}{2(5\beta_2 - 6\beta_1 - 9)}} \qquad \text{(III)}$$
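
A numerical check of case (3) is easy to set up. The sketch below evaluates the closed forms for $a$, $b_0$, $b_1$, $b_2$ and substitutes them back into the four equations; the function names and the sample moments are purely illustrative assumptions.

```python
def pearson_constants(mu2, mu3, mu4):
    """Closed-form a, b0, b1, b2 for the origin at the mean (mu0 = 1, mu1 = 0)."""
    common = 10 * mu2 * mu4 - 18 * mu2 ** 3 - 12 * mu3 ** 2
    a = mu3 * (mu4 + 3 * mu2 ** 2) / common
    b0 = -mu2 * (4 * mu2 * mu4 - 3 * mu3 ** 2) / common
    b1 = -a
    b2 = -(2 * mu2 * mu4 - 3 * mu3 ** 2 - 6 * mu2 ** 3) / common
    return a, b0, b1, b2


def residuals(mu2, mu3, mu4, a, b0, b1, b2):
    """Left side minus right side of the four equations kept in case (3)."""
    return [
        a + b1,
        b0 + 3 * b2 * mu2 + mu2,
        a * mu2 + 3 * b1 * mu2 + 4 * b2 * mu3 + mu3,
        a * mu3 + 3 * b0 * mu2 + 4 * b1 * mu3 + 5 * b2 * mu4 + mu4,
    ]


mu2, mu3, mu4 = 2.0, 1.0, 15.0   # illustrative values only
constants = pearson_constants(mu2, mu3, mu4)
print(constants)
print(residuals(mu2, mu3, mu4, *constants))
```

All four residuals vanish, which confirms that the signs in the closed forms agree with the system as written.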


5. The reasoning by which equation (I) was first obtained
showed that a is the distance between the origin and the mode,
or as the origin has now been transferred to the mean by putting
$\mu'_1 = 0$, a is the distance between the mean and the mode.
This distance in terms of the moments is, therefore,

$$\frac{\sigma\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$

where $\sigma$ is the standard deviation $\sqrt{\mu_2}$.

Since the skewness is the distance between the mean and
mode divided by the standard deviation,

$$\text{Skewness} = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$
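
The skewness formula translates directly into code. One assumption is made loudly here: $\sqrt{\beta_1}$ is given the sign of $\mu_3$, since squaring $\mu_3$ to form $\beta_1$ loses that sign. The function name is hypothetical.

```python
from math import copysign, sqrt


def pearson_skewness(mu2, mu3, mu4):
    """(mean - mode) / sigma from the second, third and fourth central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    # assumption: the root of beta1 carries the sign of mu3
    root_beta1 = copysign(sqrt(beta1), mu3)
    return root_beta1 * (beta2 + 3) / (2 * (5 * beta2 - 6 * beta1 - 9))


print(pearson_skewness(2.0, 1.0, 15.0))
print(pearson_skewness(2.0, -1.0, 15.0))
```

Multiplying the result by $\sigma$ gives the distance between mean and mode, that is, the constant $a$ of equation (III).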


6. It would be possible to obtain constants in the differential
equation (I) by using a greater number of terms and retaining
$b_3$, $b_4$, etc., but there are strong practical objections to such
a course. Besides the increase in arithmetical work, the gain
in introducing additional constants is small because the higher
moments become untrustworthy, as we have already noticed.
Karl Pearson has shown* that “we might easily on a random
sample reach a 7th or 8th moment having half or double the
value it actually has in the general population. Constants
based on these high moments will be practically idle. They
may enable us to describe closely an individual random sample,
but no safe argument can be drawn from this individual
sample as to the general population at large, at any rate so
far as the argument is based on the constants depending on
these high moments.” In some actuarial statistics where there
are as many as 100,000 cases, it might be worth while to go
as far as the next term of the series, but even here the value
of the work is discounted because any other smaller body of
statistics on the same subject could not be compared
satisfactorily with the result. For practical purposes it is probable
that the equation taken as far as $b_2$ will be sufficient, and we
shall confine our attention to the forms thus obtained.

* “Skew correlation and non-linear regression”, Drapers' Company Res. Mem.
1905, p. 9. See also Chapter X.

7. Turning to the particular form of equation (I) given in
equation (III) it will be seen that it is possible to obtain a
formula representing the statistics by inserting in that
equation the values of the moments found from the statistics, but
this would not give a graduation in the same form as that in
which the original data appeared, for in the latter we have y,
while the former gives $\dfrac{1}{y}\dfrac{dy}{dx}$ or $\dfrac{d\log y}{dx}$. It would, therefore,
be necessary to integrate the expression we obtain in order to
get terms comparable with the original data, and it is better in
practical work to deal with the equations in the forms in which
we require them for comparison, rather than by using the
differential equation and then integrating the result. The latter
method could only give proportional not actual frequencies.
8. The next step is, therefore, to replace the equation

$$\frac{d\log y}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2}$$

by one of the form $y = f(x)$, and to do this $\dfrac{x + a}{b_0 + b_1 x + b_2 x^2}$ must
be integrated.

Let us consider equation (III) as a general expression for
integration, then we notice that the form the integral takes
depends on the particular values of the coefficients of x in the
denominator. The problem is, in fact, merely a consideration
of the forms taken by the denominator for

$$b_0 + b_1 x + b_2 x^2 = b_2\left\{x - \frac{-b_1 + \sqrt{b_1^2 - 4b_0 b_2}}{2b_2}\right\}\left\{x - \frac{-b_1 - \sqrt{b_1^2 - 4b_0 b_2}}{2b_2}\right\}$$

and the criterion for fixing the form in a particular case is,
obviously, the same as that for the nature of the roots of the
equation $b_0 + b_1 x + b_2 x^2 = 0$, viz. $b_1^2/(4b_0 b_2)$, which, by
substituting from formula (III), gives

$$\frac{\beta_1(\beta_2 + 3)^2}{4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)} \qquad \text{(IV)}$$


9. If expression (IV) is negative the roots are real and of
different sign, and we get one of the main types of curve,
called Type I by Karl Pearson, to whom this system of curves
is due; if expression (IV) is positive and less than unity the
roots are complex, and we get the second main type (Pearson’s
Type IV); and if expression (IV) is positive and greater than
unity the roots are real and of the same sign, and we reach the
third main type (Pearson’s Type VI).

This really covers the whole field, but in the limiting cases
when one type changes into another we reach simpler forms of
transition curves. Thus when the criterion is large
(theoretically infinite) one root is $\infty$ (Type III), when it is unity the
two roots are equal (Type V), and when it is zero the roots are
equal in magnitude but of opposite sign (Type II). If in the
last case $b_1 = b_2 = 0$, we reach what we shall call the “normal
curve of error”; this name is open to some objection just as
are the other names given to it (e.g. Probability curve,
Gaussian curve, etc.). Then again the expression for $(d\log y)/dx$
may be reducible to the form $a'/(b_0 + b_1 x)$ and we have a
binomial or a straight line for the frequency-curve (cf. Types
VIII, IX and XI), while if the expression reduces to a constant
the curve is the ordinary geometrical progression which we
are pleased to find as a special case of a system of
frequency-curves because we are already familiar with it in the theory of
probability in connection with sequences from coin tossing,
etc. As we proceed we shall find that in certain circumstances
the curves may be J-shaped or even U-shaped, with limits of
a single ordinate or two separated ordinates. A diagram at
the end of the book will give the reader an idea of the variety
of shapes taken by the curves evolved from the formula

$$\frac{d\log y}{dx} = \frac{x + a}{b_0 + b_1 x + b_2 x^2}$$

In practice we shall require the equations to the various kinds
of frequency-curve, and we shall also want to know which
type should be used in a particular case. We cannot usually
guess the type from the appearance of the rough data and
need an arithmetical test.
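
The arithmetical test can be sketched as a small function of $\beta_1$ and $\beta_2$. The function names are hypothetical, and the sketch deals only with the three main types; a criterion of exactly 0 or 1 is reported as a transition case, and the infinite criterion of Type III (a zero denominator) is not handled.

```python
def criterion(beta1, beta2):
    """Expression (IV): b1^2 / (4 b0 b2) written in terms of beta1 and beta2."""
    return beta1 * (beta2 + 3) ** 2 / (
        4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1))


def main_type(beta1, beta2):
    """Main type according to the sign and size of the criterion."""
    value = criterion(beta1, beta2)
    if value < 0:
        return "Type I"
    if 0 < value < 1:
        return "Type IV"
    if value > 1:
        return "Type VI"
    return "transition case"  # exactly 0 or 1


for beta1, beta2 in [(0.096, 2.585455), (0.5, 4.5), (1.5, 6.0)]:
    print(beta1, beta2, round(criterion(beta1, beta2), 4), main_type(beta1, beta2))
```

In practice the criterion never falls exactly on a critical value, so the choice of a transition type remains a matter of judgment.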

10. We will deal first with the equations to the
frequency-curves, that is, with the actual integration, and begin with the
three main types.

First Main Type (Pearson's Type I). The factors in the
denominator, when the roots of $b_0 + b_1 x + b_2 x^2 = 0$ are real
and of different signs, take the form

$$\left\{x - \frac{-b_1 + \sqrt{\text{a positive quantity}}}{2b_2}\right\}\left\{x - \frac{-b_1 - \sqrt{\text{a positive quantity}}}{2b_2}\right\}$$

and the expression to be integrated is therefore of the form

$$\frac{x + a}{b_2(x + A_1)(x - A_2)} = \frac{1}{b_2}\,\frac{A_1 - a}{A_1 + A_2}\,\frac{1}{x + A_1} + \frac{1}{b_2}\,\frac{A_2 + a}{A_1 + A_2}\,\frac{1}{x - A_2}$$

by partial fractions.

The integration is now simple, and gives

$$\log y = \frac{1}{b_2}\,\frac{A_1 - a}{A_1 + A_2}\log(x + A_1) + \frac{1}{b_2}\,\frac{A_2 + a}{A_1 + A_2}\log(x - A_2) + \text{a constant}$$

$$y = y'\,(x + A_1)^{\frac{1}{b_2}\frac{A_1 - a}{A_1 + A_2}}\,(x - A_2)^{\frac{1}{b_2}\frac{A_2 + a}{A_1 + A_2}}$$

where $y'$ results from the constant introduced by integration.
If the origin is now transferred to the mode (i.e. put x for
$x + a$), we have

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2}$$

where $m_1/a_1 = m_2/a_2$.

Second Main Type (Pearson's Type IV). If the roots of the
equation $b_0 + b_1 x + b_2 x^2 = 0$ are complex, it is impossible to
throw the denominator into real factors, and when this occurs
we have to integrate by putting the expression on the
right-hand side of the fundamental differential equation in the form

$$\frac{X + c}{b_2(X^2 + A^2)}$$

where $X = x + \dfrac{b_1}{2b_2}$, $c = a - \dfrac{b_1}{2b_2}$ and $A^2 = \dfrac{b_0}{b_2} - \dfrac{b_1^2}{4b_2^2}$.

Then

$$
\begin{aligned}
\log y &= \int \frac{X + c}{b_2(X^2 + A^2)}\,dX\\
&= \int \frac{X}{b_2(X^2 + A^2)}\,dX + \int \frac{c}{b_2(X^2 + A^2)}\,dX\\
&= \frac{1}{2b_2}\log(X^2 + A^2) + \frac{c}{b_2 A}\tan^{-1}\frac{X}{A} + \text{constant}
\end{aligned}
$$

$$y = y'\,(X^2 + A^2)^{1/2b_2}\,e^{\frac{c}{b_2 A}\tan^{-1}\frac{X}{A}}$$

or

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu\tan^{-1}(x/a)}$$

where a has a meaning different from that implied in equation
(I). The relation between this type and Type I can be seen by
factorising the denominator of the right-hand side of the
differential equation, $b_2(X - iA)(X + iA)$, and then obtaining
an expression for y having the same form as Type I, but
containing complex expressions.

Third Main Type (Pearson's Type VI). The factorising is
the same as Type I, but the roots of the equation being of
like sign, the factors of the denominator take the form
$(x + A_1)(x + A_2)$. The work is then the same, but at the end the
origin is put by Pearson not at the mode but so that one of the
expressions $x + A_1$ or $x + A_2$ can be written as x. The form is

then $y = y_0 (x - a)^{m_1} x^{-m_2}$.

11. We may now set out a few of the transition types.

Pearson’s Type II is the same as his Type I when $a_1 = a_2$.

Type III. This type is reached when the criterion is $\infty$,
which happens when $b_2 = 0$.

$$
\begin{aligned}
\log y &= \int \frac{x + a}{b_0 + b_1 x}\,dx\\
&= \int \left\{\frac{1}{b_1} + \frac{a - b_0/b_1}{b_1 x + b_0}\right\}dx\\
&= \frac{x}{b_1} + \frac{a - b_0/b_1}{b_1}\log(b_1 x + b_0) + \text{constant}
\end{aligned}
$$

and $y = y'\,e^{x/b_1}\,(b_1 x + b_0)^{(a - b_0/b_1)/b_1}$

or, by changing the origin,

$$y = y_0\,e^{-\gamma x}\left(1 + \frac{x}{a}\right)^{\gamma a}$$

where a has a meaning different from that implied in equation
(I). This type can be seen to be a particular case of Type I
when $a_2$ becomes infinite.

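Type V. In this case, when the roots are real and equal,

$$
\begin{aligned}
\log y &= \int \frac{x + a}{b_2(x + b_1/2b_2)^2}\,dx\\
&= \int \frac{(x + b_1/2b_2) + (a - b_1/2b_2)}{b_2(x + b_1/2b_2)^2}\,dx\\
&= \int \frac{dx}{b_2(x + b_1/2b_2)} + \int \frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)^2}\,dx\\
&= \frac{1}{b_2}\log(x + b_1/2b_2) - \frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)} + \text{constant}
\end{aligned}
$$

$$y = y'\,(x + b_1/2b_2)^{1/b_2}\,e^{-\frac{a - b_1/2b_2}{b_2(x + b_1/2b_2)}} = y_0\,x^{-p}\,e^{-\gamma/x}$$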

Normal Curve of Error. Putting $b_1 = b_2 = 0$,

$$
\begin{aligned}
\log y &= \int \frac{x + a}{b_0}\,dx\\
&= \frac{x^2}{2b_0} + \frac{ax}{b_0} + \text{constant}\\
&= \frac{(x + a)^2}{2b_0} + \text{constant}
\end{aligned}
$$

$$\therefore\; y = y'\,e^{(x + a)^2/2b_0}$$

or, by changing the origin and altering the constant,

$$y = y_0\,e^{-x^2/2\sigma^2}$$

In a similar way the other less important transition curves
can be obtained. These are

$$y = y_0\left(1 + \frac{x}{a}\right)^{-m}, \qquad y = y_0\left(1 + \frac{x}{a}\right)^{m}, \qquad y = y_0\,e^{-x/\sigma}$$

and we reach J-shaped curves when in Type I either $m_1$ or $m_2$
is negative and U-shaped curves when both are negative.
J-shaped curves can also be obtained with Type III.

12. A table is inserted which gives the list of curves with
Pearson’s numbering and with the origin as he generally uses it.
This is convenient because in reading other work on the subject
it will be found that Pearson’s numbering, etc. is usually
adopted. I have, however, added a note of the equation to
each curve when the origin is at the mean. There is something
to be said for uniformity as regards the origin, and the mean
is convenient because all distributions have means and the
moments are worked out about the vertical through the mean.
A column in the table gives criteria to show which curve should
be used in an individual case.

We may here deal with a little difficulty that students
sometimes encounter in connection with types which may be
expressed in the same algebraic form (e.g. Types VIII, IX
and XI can all be written $hx^k$). The question may be asked
why we should not fit $hx^k$ from a to b and find h, k, a and b from
the equations for the moments. The answer is that the criteria
afford in effect a simplification of the equations and
automatically tell us a good deal about the value of the constants
and the range of the curve.

13. We shall return to some of the technical points when
discussing numerical examples in the next chapter but may
now recapitulate the method, and see the steps that have to
be taken to fit a frequency-curve to statistics.


(1) Arrange the statistics in sequence.

(2) Calculate the moments about a convenient vertical.

(3) Transfer the moments to the centroid vertical (vertical
through the mean).

(4) If there is high contact at both ends of the curve,
apply Sheppard's adjustments to the moments (i.e.
deduct $\frac{1}{12}$ and $\frac{1}{2}\nu_2 - \frac{7}{240}$ from the second and fourth
moments respectively). If there is not high contact, see
Appendix I.

(5) Calculate the criterion.

(6) By means of Table VI decide which curve should be
used.

As an alternative to (5) and (6), the curve to be used can be
found from diagrams in Tables for Statisticians, which show
the type in terms of $\beta_1$ and $\beta_2$.
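
Steps (2) to (5) can be strung together as in the sketch below. It assumes equal groups taken as the unit, the first group as the convenient vertical, and high contact at both ends when Sheppard's adjustments are switched on; the function name and the frequencies are made up for illustration.

```python
def fit_preliminaries(frequencies, sheppard=True):
    """Steps (2) to (5): moments, transfer to the mean, adjustment, criterion."""
    total = sum(frequencies)
    # (2) moments about a convenient vertical (the first group), unit = one group
    raw = [sum(f * x ** k for x, f in enumerate(frequencies)) / total
           for k in range(5)]
    d = raw[1]
    # (3) transfer to the vertical through the mean
    nu2 = raw[2] - d ** 2
    nu3 = raw[3] - 3 * d * raw[2] + 2 * d ** 3
    nu4 = raw[4] - 4 * d * raw[3] + 6 * d ** 2 * raw[2] - 3 * d ** 4
    # (4) Sheppard's adjustments, valid only with high contact at both ends
    if sheppard:
        mu2, mu3, mu4 = nu2 - 1 / 12, nu3, nu4 - nu2 / 2 + 7 / 240
    else:
        mu2, mu3, mu4 = nu2, nu3, nu4
    # (5) the criterion
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    kappa = beta1 * (beta2 + 3) ** 2 / (
        4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6))
    return {"mean": d, "mu2": mu2, "mu3": mu3, "mu4": mu4,
            "beta1": beta1, "beta2": beta2, "kappa": kappa}


# made-up frequencies, already arranged in sequence (step 1)
example = [2, 9, 21, 30, 24, 10, 3, 1]
for name, value in fit_preliminaries(example).items():
    print(f"{name:6s} {value: .5f}")
```

Step (6) is then a matter of looking up the value of the criterion in the table of types.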



CHAPTER V


CALCULATION

1. The next point to be considered is the calculation of the
constants for any particular distribution, when the moments
have been calculated and the type to be used has been decided.
The formulae required for the numerical work will be given for
each type; a numerical example, including the calculation of
the graduated figures, will follow, with the proofs of the
formulae.

2. Some general points relating to the calculation of the
curves when the constants have been found may be
conveniently considered here. When the constants are known,
we can calculate the ordinate for any value of x by substituting
that value in the expression for the frequency-curve, and if
areas are required, some method of proceeding from ordinates
to areas must be found. The most simple is probably to
calculate mid-ordinates, and then by the quadrature formula (I)
or (II) on p. 27 find the areas. It is occasionally more
convenient to calculate the ordinates at the beginning of each
group, and then formula (III) should be used. These formulae
can be best applied in the form of differences; thus, from (II)
we have

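$$\int_{-\frac12}^{\frac12} y_x\,dx = y_0 - \tfrac{1}{24}\{\Delta y_{-1} - \Delta y_0\}$$

from (I)

$$\int_{-\frac12}^{\frac12} y_x\,dx = y_0 - \tfrac{97}{1920}\{\Delta y_{-1} - \Delta y_0\} + \tfrac{17}{5760}\{\Delta y_{-2} - \Delta y_1\}$$

from (III)

$$\int_{-\frac12}^{\frac12} y_x\,dx = \tfrac12\,(y_{\frac12} + y_{-\frac12}) + \tfrac{1}{24}\{\Delta y_{-1\frac12} - \Delta y_{\frac12}\}$$

Formula (II) is generally sufficiently accurate, while the others
will be found to give a result true to five figures in ordinary
cases; exceptional cases will be referred to in the numerical
examples that follow.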
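
The difference form of formula (II) can be tried on a curve whose area is known exactly. The sketch assumes that (II) is the usual mid-ordinate rule with a second-difference correction, applied to a group of unit width centred on the mid-ordinate; the function names and the test cubic are illustrative only.

```python
def area_from_mid_ordinates(y_before, y_mid, y_after):
    """Area of a unit-width group from its mid-ordinate and the two neighbours."""
    delta_before = y_mid - y_before    # first difference ending at the group
    delta_after = y_after - y_mid      # first difference starting at the group
    return y_mid - (delta_before - delta_after) / 24


def cubic(x):
    return 2.0 + 0.5 * x - 0.3 * x ** 2 + 0.1 * x ** 3


# exact integral of the cubic over (-1/2, 1/2): the odd powers cancel
exact = 2.0 - 0.3 / 12
approx = area_from_mid_ordinates(cubic(-1.0), cubic(0.0), cubic(1.0))
print(exact, approx)
```

For a cubic the rule is exact, so any discrepancy in practice comes from fourth and higher differences, which is where formula (I) takes over.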

3. It is sometimes a help to see the graduation expressed
graphically, and this has been done with some of the examples.
The best method is to insert a vertical height $y_0$ at the mode,
note the ends of the curve, and the heights of the ordinates that
have been calculated. These heights give points on the curve,
which can be drawn through them fairly easily. In drawing the
curve, as well as in calculating the constants, the sign of the
skewness must be borne in mind, for it is possible to draw the
curve with the skewness on the wrong side of the mode, and if
the distribution is nearly symmetrical, it is not so easy to
notice the mistake as one might expect. The tangent to the
curve at the mode is parallel to the axis of x except in the case
of the J-shaped curves or some of the less common transition
types.

4. It is best to draw on a rather large scale in order to gain
distinctness, and the curves given here were drawn larger than
their present size, the reduction being, of course, made in the
process of reproduction.

The base elements should also be fairly large in proportion to
the height, so that the curve may not ascend too steeply;
otherwise small horizontal differences between the graduated
and ungraduated curves are apt to conceal large vertical
differences when the curve is rising or falling rapidly, but it
is the latter differences that are of importance.

5. The reader should notice that all the cases considered
in the following pages assume complete distributions, and it
is in general only possible to find the curve from part of a
distribution by means of successive approximation which is
extremely laborious. Another point, to which reference will
again be made, is with regard to grouping statistics; it is
sometimes impossible to obtain many groups, but for accuracy
in finding moments the greater the number of groups the
better, unless the total number of cases is small. A little
discretion is needed in this respect, but in actuarial statistics
which are sometimes based on as many as 200,000 cases,
seventy groups might be used for great accuracy. In our
examples we have grouped merely to save work, space and
printing, and the grouping does not alter the method.

If there is high contact so that we know the proper
adjustments, grouping leads to little or no error. An adjustment of
one-twelfth to the second moment when ten ages are grouped
and used as the unit has much more effect, proportionately,
than when only five ages are grouped or when individual
ages are used. The fear sometimes expressed that grouping
destroys accuracy has no proper foundation in such cases;
a little numerical evidence on this point will be found in
Appendix I.

6. Another matter with which it seems advisable to deal
here is connected with the criterion, $\kappa$. This may have any
value from $-\infty$ to $+\infty$, and from the following diagram it
will be seen how the types cover all the possible values of the
criterion and do not overlap.

κ = -∞            κ = 0              κ = 1             κ = ∞
   |   κ negative   |  κ > 0 and < 1   |      κ > 1      |
   |     Type I     |     Type IV      |     Type VI     |
Type III       Normal curve         Type V           Type III
               when β2 = 3
               Type II
               when β2 ≠ 3

Just before $\kappa = 0$, Type I becomes nearly symmetrical, and
after that value is passed we have a skew curve of unlimited
range, and so on. At each critical point there are one or more
“transition” curves. If by a mistake a student should use
the wrong main type, he will find his mistake by reaching an
imaginary quantity in one of the square roots which occur in
the equations for the constants, but transition types can be
used when the values of the criterion approximate to the
theoretical values; they can, in fact, be viewed as
approximations which give an accurate result in a limiting case. It is
impossible to give exact limits within which we are justified
in using a transition type; theoretically, as we shall see later,
the justification depends on the size of the standard error of
the function dealt with, but in practice we can be guided to
a great extent by the size of the experience; if there are few
cases, a larger deviation in the criterion will arise than if there
are many. Individual cases must be considered on their
merits, but if the student finds himself in doubt he can avoid
using the transition type and be on the safe side in the matter
of accuracy. The student has one safe guide in every case,
namely, that “the proof of the pudding is in the eating”.
He should try transition curves in a few cases where he has
little hope of their applicability and compare the results with
those obtained by the right main types and he will then learn
much about both classes of curves.

7. In the formulae that are given for the various types, the
choice of sign for a square root depends on the sign of $\mu_3$. If
the frequency is concentrated more closely before the mean
than after it, the mode is on the left-hand side of the mean and
$\mu_3$ is positive; the signs of certain constants in each type must
therefore depend on the sign of $\mu_3$ in order that the mode and
mean may lie in their correct relative positions. Where,
however, no remark is made as to the sign of the expression in
which a square root is given the positive root is implied, and the
reader will find that these rules become easier to follow when
he has worked but two examples, one giving a positive and the
other a negative value for $\mu_3$. Thus, if we imagine the
frequencies in the example for Type I to be written in the opposite
order 1, 3, 7, 13, etc., all the numerical work would be the
same, but $m_1$ would be 2.776978, $m_2$ = .409833, $a_1$ = 13.52728,
and $a_2$ = 1.99638, and the graduation would be the same, but
the numbers in the columns of the table on p. 62 would run
in the opposite order.

8. The arithmetical work is heavy and in some respects
unfamiliar to most students. There is no royal road to success
in it except care, system and the use of common-sense at the
final stage. It is irritating at the end of a lengthy piece of
arithmetic to find a slip at an early stage and to have to
recalculate, but these slips become fewer and of less importance
with experience, for when we are in practice we suspect a large
error immediately an erroneous value is reached. Personally
I use seven-figure logarithms as a rule and put a check on
every step, although not necessarily to the last figure. This
plan was followed with the arithmetical work in this book.
The check might not disclose a slip which did not affect the
graduation or only affected the final figures of a constant or
coefficient. Thus if the last three figures of $\log y_0$ on p. 61 were
wrong (which I have no reason to suppose) the mistake would
be regrettable, but the graduation in the table on the following
page would be unaffected. Moreover, difficulty may be found
in reproducing exactly the numerical result of another
calculator, owing to the usual unreliability of the end figures when
many operations have been made. In lengthy arithmetic the
two final figures may be unreliable and two arithmetical
processes may both be correct and yet give divergencies. This
does not mean that five-figure logarithms are as good as seven,
for if seven figures give five figures accurately, we assume that
generally speaking five-figure work will only be reliable to
three figures.

FORMULAE FOR MOMENTS

THESE FORMULAE APPLY TO ALL THE
TYPES OF CURVES


$$
\begin{aligned}
\nu'_1 &= d\\
\nu_2 &= \nu'_2 - d^2\\
\nu_3 &= \nu'_3 - 3d\nu_2 - d^3\\
\nu_4 &= \nu'_4 - 4d\nu_3 - 6d^2\nu_2 - d^4
\end{aligned}
$$

or

$$
\begin{aligned}
\nu'_1 &= d\\
\nu_2 &= \nu'_2 - d^2\\
\nu_3 &= \nu'_3 - 3d\nu'_2 + 2d^3\\
\nu_4 &= \nu'_4 - 4d\nu'_3 + 6d^2\nu'_2 - 3d^4
\end{aligned}
$$

or

$$
\begin{aligned}
S_2 &= d\\
\nu_2 &= 2S_3 - d(1 + d)\\
\nu_3 &= 6S_4 - 3\nu_2(1 + d) - d(1 + d)(2 + d)\\
\nu_4 &= 24S_5 - 2\nu_3\{2(1 + d) + 1\} - \nu_2\{6(1 + d)(2 + d) - 1\} - d(1 + d)(2 + d)(3 + d)
\end{aligned}
$$

Sheppard’s adjustments when the
curve has high contact at both ends

$$
\begin{aligned}
\mu_2 &= \nu_2 - \tfrac{1}{12}\\
\mu_3 &= \nu_3\\
\mu_4 &= \nu_4 - \tfrac{1}{2}\nu_2 + \tfrac{7}{240}
\end{aligned}
$$

$\sigma$ (standard deviation) $= \sqrt{\mu_2}$

$\beta_1 = \mu_3^2/\mu_2^3$

$\beta_2 = \mu_4/\mu_2^2$

$$\kappa = \frac{\beta_1(\beta_2 + 3)^2}{4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6)}$$
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FIRSj|T MAIN TYPE (TYPE I) 


$$y = y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2}$$

where $m_1/a_1 = m_2/a_2$.
Origin at mode.

The values to be calculated in order are

$$r = \frac{6(\beta_2-\beta_1-1)}{6+3\beta_1-2\beta_2}$$
$$a_1 + a_2 = \tfrac{1}{2}\sqrt{\mu_2}\,\sqrt{\beta_1(r+2)^2+16(r+1)}$$

The $m$'s are given by

$$\tfrac{1}{2}\left\{r-2 \pm r(r+2)\sqrt{\frac{\beta_1}{\beta_1(r+2)^2+16(r+1)}}\right\}$$

when $\mu_3$ is positive $m_2$ is the positive root.

$$y_0 = \frac{N}{a_1+a_2}\cdot\frac{m_1^{m_1}\,m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

$$\text{Mode} = \text{Mean} - \frac{1}{2}\,\frac{\mu_3}{\mu_2}\,\frac{r+2}{r-2}$$

If expressing curve with origin at mean (see Table VI facing
p. 51):

$$A_1 + A_2 = a_1 + a_2$$
$$(m_1+1)/A_1 = (m_2+1)/A_2$$
$$y_e = \frac{N}{A_1+A_2}\cdot\frac{(m_1+1)^{m_1}(m_2+1)^{m_2}}{(m_1+m_2+2)^{m_1+m_2}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

For table of $\Gamma$ functions see p. 266, or Tables for Statisticians.

NOTES


The usual shape of the curve is like that of the following
example, but if $m_1$ and $m_2$ are approximately equal it is nearly
symmetrical. If $m_1$ and $m_2$ are not small it tails off at both ends,
and if both $m_1$ and $m_2$ are small it rises abruptly at both ends.
If $m_1$ is negative the curve is J-shaped; it starts at an infinite
ordinate, falls rapidly and runs out at a fixed point (for
numerical example see p. 126). If both $m_1$ and $m_2$ are negative,
the curve is U-shaped, starting and ending with infinite
ordinates and having an anti-mode instead of a mode as the
usual origin (for numerical example see p. 112). In the J- and
U-shaped curves, though the ordinate is infinite, the area is
finite. Care is needed in these cases when taking out the $\Gamma$
function, for $\Gamma(t)$ is required when $t < 1$ and the tables give
$\log \Gamma(1+t)$, i.e., $\log t + \log \Gamma(t)$. In the case of U-shaped curves
it is best to use the form with origin at the mean or express
the curve in the form $y' x^{m_1}(a_1 + a_2 - x)^{m_2}$ with the origin at the
start of the curve and

$$y' = \frac{N}{(a_1+a_2)^{m_1+m_2+1}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

An interesting variant of the J-shaped curve arises when $m_1$
and $m_2$ are both arithmetically less than unity and one of them
is negative. The shape is then like that of No. (11) in the
diagram of curves at the end of the book, i.e. it is of twisted
J-shape (for example and further notes, see pp. 111-3).

EXAMPLE 

As an example of this type the figures in Table I (Example II)
may be used. The moments were first found by the summation
method (see Chapter III, § 9) as shown in the following table.
The reader can check the result by recalculating the moments
by the more direct method, taking age 42 as the arbitrary
origin. This is how I should myself usually do the work; I only
use the summation method when the series is a very long one,
and I give it here merely by way of example.


Central   Exposed to risk     First    Second     Third     Fourth
age of    (Example II         sum      sum        sum       sum
group     of Table I)

  17           34            1,000     5,175     19,809     64,389
  22          145              966     4,175     14,634     44,580
  27          156              821     3,209     10,459     29,946
  32          145              665     2,388      7,250     19,487
  37          123              520     1,723      4,862     12,237
  42          103              397     1,203      3,139      7,375
  47           86              294       806      1,936      4,236
  52           71              208       512      1,130      2,300
  57           55              137       304        618      1,170
  62           37               82       167        314        552
  67           21               45        85        147        238
  72           13               24        40         62         91
  77            7               11        16         22         29
  82            3                4         5          6          7
  87            1                1         1          1          -

Totals      1,000            5,175    19,809     64,389    186,638


$S_2$ = 5175/1000 = 5.175

$S_3$ = 19809/1000 = 19.809
$S_4$ = 64389/1000 = 64.389
$S_5$ = 186638/1000 = 186.638

The next step is to find the moments about the centroid
vertical using the formulae on p. 57, and, in this case, as no
adjustments* were made in the moments the $\nu$'s and $\mu$'s are
the same.

$\mu_2$ = 7.66237        $\beta_1$ = 0.5072955

$\mu_3$ = 15.1069        $\beta_2$ = 2.935110

$\mu_4$ = 172.326

From the values of $\beta_1$ and $\beta_2$ the criterion ($\kappa$) can be
calculated, and its value being -0.2645 shows that Type I
must be used (see Table VI).
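The summation method and the criterion can be sketched in a few lines. This is a minimal illustration, assuming the exposed-to-risk column of the table above and the summation formulae for moments given earlier in the section; the function names are hypothetical.

```python
from itertools import accumulate

def summation_totals(freqs, order=4):
    """Totals of the frequency column and of its first ... fourth sums,
    each sum column being formed by adding up from the bottom."""
    col = list(freqs)
    totals = [sum(col)]
    for _ in range(order):
        col = list(accumulate(reversed(col)))[::-1]   # cumulate from the last group upwards
        totals.append(sum(col))
    return totals

def moments_from_sums(totals):
    """Moments about the mean from S2..S5 (summation formulae)."""
    n = totals[0]
    d, s3, s4, s5 = (t / n for t in totals[1:5])
    nu2 = 2 * s3 - d * (1 + d)
    nu3 = 6 * s4 - 3 * nu2 * (1 + d) - d * (1 + d) * (2 + d)
    nu4 = (24 * s5 - 2 * nu3 * (2 * (1 + d) + 1)
           - nu2 * (6 * (1 + d) * (2 + d) - 1)
           - d * (1 + d) * (2 + d) * (3 + d))
    return d, nu2, nu3, nu4

def betas_and_kappa(mu2, mu3, mu4):
    """beta_1, beta_2 and the criterion kappa."""
    b1 = mu3 ** 2 / mu2 ** 3
    b2 = mu4 / mu2 ** 2
    kappa = b1 * (b2 + 3) ** 2 / (4 * (4 * b2 - 3 * b1) * (2 * b2 - 3 * b1 - 6))
    return b1, b2, kappa

exposed = [34, 145, 156, 145, 123, 103, 86, 71, 55, 37, 21, 13, 7, 3, 1]
totals = summation_totals(exposed)
d, nu2, nu3, nu4 = moments_from_sums(totals)
beta1, beta2, kappa = betas_and_kappa(nu2, nu3, nu4)
print(totals, round(nu2, 5), round(nu3, 4), round(beta1, 5), round(kappa, 4))
```

No Sheppard adjustment is applied, in line with the example, so the $\nu$'s are used directly as the $\mu$'s. A negative criterion is the signal for Type I.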

* The moments should have been adjusted by one of the methods suitable
when the curve is abrupt. These have been discussed since the example was
prepared, and it was unnecessary to recalculate; see, however, Appendix I.
Similar qualifications apply to a few of the other examples.

$r$ = 5.186811           $\log r$ = 0.7149004

$r+1$ = 6.186811         $\log (r+1)$ = 0.7914669

$r+2$ = 7.186811         $\log (r+2)$ = 0.8565363

$r-2$ = 3.186811         $\log (r-2)$ = 0.5033563

The values of $\log (r+1)$, etc. were checked by a Gauss-
logarithm table.

$a_1 + a_2$ = 15.52366     $a_1$ = 1.99638

$m_1$ = 0.409833           $a_2$ = 13.52728

$m_2$ = 2.776978           Mean - mode = 2.223116

It will be noted that the expression $\sqrt{\beta_1(r+2)^2 + 16(r+1)}$
occurs in both the values of $(a_1 + a_2)$ and $m$.

The mean is at age 12 + 5.175 × 5 = 37.8750, and the mode
at age 37.8750 - 2.223116 × 5 = 26.75942.

The skewness is 0.8032.

The calculation of $\log y_0$ is as follows:

$\log N$                     = 3.00000
colog $(a_1 + a_2)$          = $\bar{2}.80901$
$m_1 \log m_1$               = $\bar{1}.84123$
$m_2 \log m_2$               = 1.23179
colog $(r-2)^{r-2}$          = $\bar{2}.39590$
$\log \Gamma(r)$             = 1.50406
colog $\Gamma(m_1 + 1)$      = 0.05219
colog $\Gamma(m_2 + 1)$      = $\bar{1}.34037$
$\log y_0$                   = 2.17455


where, of course, $\log \Gamma(m_2 + 1) = \log \Gamma(3.776978) = \log 2.776978
+ \log 1.776978 + \log \Gamma(1.776978)$, the last value being taken
from the table at the end of the book.
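The logarithmic schedule can be checked mechanically. The sketch below assumes the constants quoted in the example ($N$ = 1000, $a_1 + a_2$ = 15.52366, $m_1$ = 0.409833, $m_2$ = 2.776978) and uses `math.lgamma` instead of the printed table of $\Gamma$ functions; the function name is hypothetical, and the form used requires both $m$'s to be positive.

```python
import math

def type1_log10_y0(n, a_sum, m1, m2):
    """log10 of y0 for a Type I curve with origin at the mode.

    y0 = N/(a1+a2) * m1^m1 m2^m2 / (m1+m2)^(m1+m2)
         * Gamma(m1+m2+2) / (Gamma(m1+1) Gamma(m2+1))
    """
    ln_y0 = (math.log(n) - math.log(a_sum)
             + m1 * math.log(m1) + m2 * math.log(m2)
             - (m1 + m2) * math.log(m1 + m2)
             + math.lgamma(m1 + m2 + 2)
             - math.lgamma(m1 + 1) - math.lgamma(m2 + 1))
    return ln_y0 / math.log(10)

log_y0 = type1_log10_y0(1000, 15.52366, 0.409833, 2.776978)
y0 = 10 ** log_y0
print(round(log_y0, 5), round(y0, 2))
```

Each term of the sum corresponds to one line of the schedule, the cologarithms appearing as subtracted logarithms.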

The work to this point gives as the curve for graduating the
statistics

$$y = 149.47\left(1+\frac{x}{1.99638}\right)^{0.409833}\left(1-\frac{x}{13.52728}\right)^{2.776978}$$

where the origin is at age 26.75942 and the unit is five years.

The following table shows the calculation of ordinates of the
curve from the equation just given.


Columns: (1) Age; (2) $1 + x/a_1$; (3) $1 - x/a_2$; (4) log col. (2);
(5) log col. (3); (6) $m_1$ × col. (4); (7) $m_2$ × col. (5);
(8) col. (6) + col. (7) + $\log y_0$ = $\log y_x$; (9) $y_x$; (10) area.

n 

02228 

1 1 1 129 

2 31792 

0 05851 

1 3229 

0 1020 

1 1)001 

15 7 

44 

22 

52319 

1 07037 

L 718O0 

02955 

1 8817 

0821 

2 1 10 1 

138 2 

137 

27 

1 02430 

99014 

0 01031 

1 998 15 

0 0012 

1 9957 

2 17 15 

119 5 

149 

32 

1 52501 

92252 

18327 

90198 

0751 

9027 

2 1525 

1421 

142 

37 

2 02592 

81859 

30002 

92870 

1257 

8020 

2 1023 

120 0 

127 

42 

2 52083 

774GO 

40257 

88911 

1050 

0921 

2 0317 

107 0 

108 

47 

3 02774 

70074 

48111 

845 50 r 

1972 

5711 

1 9429 

87 7 

! 88 

52 

3 52805 

G2081 

517b0 

79714 

2241 

4307 

1 8357 

08 5 

09 

57 

4 02950 

55289 

00520 

74204 

2481 

2853 

1 7080 

510 

51 

02 

4 53047 

47890 

05015 

08030 

2089 

1122 

1 5557 

30 0 

36 

07 

5 03130 

10504 

70109 

00750 

2870 

2 9100 

1 3722 

23 0 

24 

72 

5 53229 

33111 

7 1291 

51997 

3045 

0070 

1 not 

14 0 

14 

77 

() 03320 

25719 

78055 

41025 

3199 

3023 

8508 

72 

1 7 

82 

0 53111 

18320 

81519 

20307 

3311 

3 9535 

1022 

29 

1 3 

87 

7 03502 

10931 

81720 

03878 

3 172 

3307 

1 8525 

7 

1 

i)2 

7 53593 j 

03511 

87711 

2 51913 

3595 

5 9709 

3 505() r 

— 



Cols. (2) and (3) have a constant first difference, viz.
$1/a_1$ or 0.500907, and $1/a_2$ or 0.073925. The value at any point
having been calculated and checked, the other items are
formed continuously. Cols. (4)-(9) explain themselves, but
we may remark that it is generally advisable to use a larger
number of figures than five in taking logarithms, especially
if $m_1$ or $m_2$ is large. A little care is necessary in multiplying
such numbers as $\bar{1}.71866$ by $m_1$ (0.409833). If an arithmometer
is used, $m_1$ is put on the plate, and is multiplied by -0.28134,
and the result -0.1153 must be put in the form $\bar{1}.8847$, to
enable us to add it to other logarithms. Col. (10) gives the area,
and was formed by applying one of the formulae on p. 52.
The area of the first group must be treated separately, as the
curve starts at age 16.7775, and the base of the group is there-
fore 2.7225 in length, instead of 5 years as in the other cases.
A good way to find the area is to calculate the ordinates for
the middle and ends of the base, and apply Simpson's rule, viz.

$$\int_0^1 y_x\,dx = \tfrac{1}{6}\{y_0 + 4y_{1/2} + y_1\}$$

remembering to multiply the result by 2.7225/5 to allow for the
different length of the base.

The mid-ordinate is 92.1, the ordinate at the end of the base
is 116.5, and the ordinate at the start is of course zero; the
area is approximately*

$$\frac{2.7225}{5}\times\tfrac{1}{6}\{0 + 4\times 92.1 + 116.5\} = 44$$
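The treatment of the first group can be followed numerically. The sketch below assumes the constants of the fitted curve given above (origin at age 26.75942, unit five years) and applies the three-ordinate Simpson's rule to the shortened base; the function names are hypothetical.

```python
def type1_ordinate(age, y0=149.47, a1=1.99638, a2=13.52728,
                   m1=0.409833, m2=2.776978, mode_age=26.75942, unit=5.0):
    """Ordinate of the fitted Type I curve at a given age (zero outside the range)."""
    x = (age - mode_age) / unit          # distance from the mode in 5-year units
    left = 1.0 + x / a1
    right = 1.0 - x / a2
    if left <= 0.0 or right <= 0.0:
        return 0.0
    return y0 * left ** m1 * right ** m2

def simpson_three_point(y_start, y_mid, y_end, base):
    """Simpson's rule with three ordinates over a base of the given length."""
    return base * (y_start + 4.0 * y_mid + y_end) / 6.0

start_age = 26.75942 - 1.99638 * 5.0     # the curve starts a1 units before the mode
end_age = 19.5                           # upper boundary of the first group
mid_age = 0.5 * (start_age + end_age)
base = (end_age - start_age) / 5.0       # base in 5-year units, about 2.7225/5

y_mid = type1_ordinate(mid_age)
y_end = type1_ordinate(end_age)
area = simpson_three_point(0.0, y_mid, y_end, base)
print(round(start_age, 4), round(y_mid, 1), round(y_end, 1), round(area, 1))
```

The ordinate at the start is entered as zero, as in the text, rather than computed, since the curve meets the axis there.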

Some people find it better when calculating the ordinates
to use the form given in the Notes on p. 59, with the origin
at the start of the curve; it avoids bringing in the reciprocals
of $a_1$ and $a_2$. The columns of $\log x$ and $\log (a_1 + a_2 - x)$ can be
formed continuously with the aid of Gauss-logarithms. The
initial values will have to be calculated and as a check one or
perhaps two other values.



[Figure: the Type I graduation plotted against age; horizontal axis marked at ages 17, 22, ..., 92.]


PROOF OF FORMULAE†

$$y = y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2},$$

where $m_1/a_1 = m_2/a_2$.

* For greater accuracy use more ordinates or Tables of incomplete B-functions.
† The reader who has little acquaintance with formulae of reduction and the
$\Gamma$ and B functions should consult Appendix II before reading the proofs of the
formulae for this and the other types.

Let $a_1 + a_2 = b$ and $z = \dfrac{a_1 + x}{a_1 + a_2}$.

The area from $x = -a_1$ to $x = +a_2$ is the total frequency $N$.

$$N = \int_{-a_1}^{a_2} y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2}dx
= \frac{y_0}{a_1^{m_1}a_2^{m_2}}\int_{-a_1}^{a_2}(a_1+x)^{m_1}(a_2-x)^{m_2}\,dx$$

$$= \frac{y_0}{a_1^{m_1}a_2^{m_2}}\int_0^1 [z(a_1+a_2)]^{m_1}\,[(1-z)(a_1+a_2)]^{m_2}\,(a_1+a_2)\,dz$$

$$= \frac{y_0\,(a_1+a_2)^{m_1+m_2+1}}{a_1^{m_1}a_2^{m_2}}\int_0^1 z^{m_1}(1-z)^{m_2}\,dz$$

$$= \frac{y_0\,(m_1+m_2)^{m_1+m_2}\,(a_1+a_2)}{m_1^{m_1}\,m_2^{m_2}}\;B(m_1+1,\,m_2+1)$$

or

$$y_0 = \frac{N}{b}\cdot\frac{m_1^{m_1}\,m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

Using the same method for the moments as that just given
for the area, we see that the $n$th moment, about the line
parallel to the axis of $y$ through $x = -a_1$, is

$$N\mu'_n = \int_{-a_1}^{a_2} y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2}(a_1+x)^n\,dx$$

$$= \frac{y_0\,b^{m_1+m_2+n+1}}{a_1^{m_1}a_2^{m_2}}\int_0^1 z^{m_1+n}(1-z)^{m_2}\,dz$$

$$= \frac{y_0\,(m_1+m_2)^{m_1+m_2}\,b^{n+1}}{m_1^{m_1}\,m_2^{m_2}}\cdot\frac{\Gamma(m_1+n+1)\,\Gamma(m_2+1)}{\Gamma(m_1+m_2+n+2)}$$

Now, since $\Gamma(p) = (p-1)\,\Gamma(p-1)$, the moments about the
line parallel to the axis of $y$ through $x = -a_1$ are as follows:

$$\mu'_1 = \frac{b(m_1+1)}{m_1+m_2+2}$$

$$\mu'_2 = \frac{b^2(m_1+1)(m_1+2)}{(m_1+m_2+2)(m_1+m_2+3)}\quad\text{and so on.}$$

Changing the origin in order to get moments about the mean
and writing $m'_1 = m_1 + 1$ and $m'_2 = m_2 + 1$ and $r = m'_1 + m'_2$, we
have

$$\mu_2 = \frac{b^2\,m'_1 m'_2}{r^2(r+1)}$$

$$\mu_3 = \frac{2b^3\,m'_1 m'_2\,(m'_2-m'_1)}{r^3(r+1)(r+2)}$$

$$\mu_4 = \frac{3b^4\,m'_1 m'_2\,\{m'_1 m'_2(r-6)+2r^2\}}{r^4(r+1)(r+2)(r+3)}$$

We can simplify these expressions to obtain the equations on
p. 58 by writing $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$ and $\epsilon = m'_1 m'_2$; then

$$\beta_1 = \frac{4(r^2-4\epsilon)(r+1)}{\epsilon(r+2)^2}\quad\text{or}\quad\frac{\beta_1(r+2)^2}{4(r+1)} = \frac{r^2}{\epsilon}-4$$

and

$$\beta_2 = \frac{3(r+1)\{2r^2+\epsilon(r-6)\}}{\epsilon(r+2)(r+3)}\quad\text{or}\quad\frac{\beta_2(r+2)(r+3)}{3(r+1)} = \frac{2r^2}{\epsilon}+r-6.$$

Eliminating $r^2/\epsilon$ we find

$$\frac{\beta_2(r+2)(r+3)}{3(r+1)} - \frac{\beta_1(r+2)^2}{2(r+1)} = r+2.$$

Dividing out by $r + 2$ we have

$$r = \frac{6(\beta_2-\beta_1-1)}{3\beta_1-2\beta_2+6}.$$

From the equation

$$\frac{\beta_1(r+2)^2}{4(r+1)} = \frac{r^2}{\epsilon}-4\quad\text{we have}\quad\epsilon = \frac{r^2}{4+\tfrac{1}{4}\beta_1\dfrac{(r+2)^2}{r+1}}$$

and from the equation for $\mu_2$

$$b^2 = \frac{\mu_2\,(r+1)\,r^2}{\epsilon}.$$

The other equations follow at once from $r = m'_1 + m'_2$ and
$\epsilon = m'_1 m'_2$. The distance between the mode and mean is
$a_1 - \mu'_1 = a_1 - b\,m'_1/(m'_1 + m'_2)$, which can be easily reduced
to the form given. A general value (regardless of type) for the
distance was given in Chapter IV, § 5.

SECOND MAIN TYPE (TYPE IV)

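$$y = y_0\left(1+\frac{x^2}{a^2}\right)^{-m}e^{-\nu\tan^{-1}(x/a)}$$

Origin is $\nu a/r$ after mean.
The values to be calculated in order are

$$r = \frac{6(\beta_2-\beta_1-1)}{2\beta_2-3\beta_1-6}$$

$$m = \tfrac{1}{2}(r+2)$$

$$\nu = \frac{r(r-2)\sqrt{\beta_1}}{\sqrt{16(r-1)-\beta_1(r-2)^2}}\quad(\nu\text{ taking the sign opposite to that of }\mu_3)$$

$$a = \sqrt{\frac{\mu_2}{16}}\,\sqrt{16(r-1)-\beta_1(r-2)^2}$$

$$y_0 = \frac{N}{a\,F(r,\nu)}$$

$$\text{Mode} = \text{mean} - \frac{1}{2}\,\frac{\mu_3}{\mu_2}\,\frac{r-2}{r+2}$$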



NOTES 


The curve is skew and has unlimited range in both directions.
$\mu_3$ and $\nu$ have opposite signs, i.e. when $\mu_3$ is positive $\nu$ is
negative.

A simple way to calculate the curve is to put it in the form

$$x = a\tan\theta,\qquad y = y_0\cos^{r+2}\theta\;e^{-\nu\theta}.$$

Then $\theta$ is taken as 10°, 20°, 30°, etc., and $x$ and $y$ found; this
gives corresponding values of $x$ and $y$, but the values of $y$ will
not be for equidistant values of $x$. In calculating $e^{-\nu\theta}$ the value
of $\theta$ must be taken in circular measure. If equidistant ordi-
nates are required to be calculated accurately, little is gained
by the double form, and if we had good tables of $\log (1 + x^2)$
and $\tan^{-1} x$, the calculation of a particular ordinate would be
a very simple matter. The calculation and meaning of $F(r, \nu)$
are dealt with in the proof. The log of this function is tabulated
in Tables for Statisticians. When $r$ is fairly large a close approxi-
mation to $y_0$, where $\tan\phi = \nu/r$, is given by

$$y_0 = \frac{N}{a}\sqrt{\frac{r}{2\pi}}\;\frac{e^{\frac{\cos^2\phi}{3r}-\frac{1}{12r}-\nu\phi}}{(\cos\phi)^{r+1}}$$

We appear to reach the expression that looks shortest and
simplest with the origin as shown on the previous page; it has
generally been used and it is therefore given. This origin has,
however, no physical meaning and there is much to be said
for using the more complicated looking form with the origin
at the mean, namely

$$y = y_0\left\{1+\left(\frac{x}{a}-\frac{\nu}{r}\right)^2\right\}^{-m}e^{-\nu\tan^{-1}(x/a-\nu/r)};$$

see table facing p. 51.

The value of this expression when $x = 0$, i.e. the value of the
ordinate at the mean, is

$$y_0\left(1+\frac{\nu^2}{r^2}\right)^{-m}e^{\nu\tan^{-1}(\nu/r)} = \frac{N}{a}\cdot\frac{1}{H(r,\nu)}$$

where $H(r, \nu)$ is a function related to $F(r, \nu)$. Its logarithm is
also tabulated in Tables for Statisticians. The reader will
appreciate at once that this curve needs considerable care;
it is the most difficult of all the Pearson-type curves.


EXAMPLES 

The numbers in the following nearly symmetrical distribution
represent the exposed to risk of sickness by Sutton's Sickness
Tables (males, all durations) when the number of weeks'
sickness is represented by the normal curve of error.

Central age     No. exposed     Graduated by
                                Type IV

     5               10               6*
    10               13              16
    15               41              49
    20              115             135
    25              326             321
    30              675             653
    35            1,113           1,108
    40            1,528           1,535
    45            1,692           1,712
    50            1,530           1,522
    55            1,122           1,074
    60              610             604
    65              255             274
    70               86             102
    75               26              32
    80                8               8
    85                2               2
    90                1               1
    95                1               -

                  9,154           9,154

* This group has been taken as the area of the rest of the curve.

The following values were obtained:

Mean = 44.5772339        $\beta_1$ = 0.0053656
$\mu_2$ = 4.527608          $\beta_2$ = 3.169897
$\mu_3$ = -0.705687         $\kappa$ = 0.0125
$\mu_4$ = 64.98048

Type IV was used because, as there is a large number of
cases, the standard error of $\kappa$ will be small (see Chapter X).

$r$ = 40.12143

$\nu$ = 4.450398 (positive because $\mu_3$ is negative)
$a$ = 13.39152
$m$ = 21.06072
Sk = -0.03313

When the 5-years unit with which we have been working
is changed to one year, $a$ becomes 66.9576, and $a^2$ = 4483.325.

The origin = mean + $\nu a/r$
= 52.004394

The mode, which is wanted if the curve is drawn, is at
44.92989.

As $r$ is large the approximate form for $y_0$ was used,

4-450398 . 8925 

tan <j> = - 121 - ^ , or, log tan <p = log tan 6° 19' hence 

$\log\cos\phi = \bar{1}.9973446$, and from this $y_0$ is found to be 273.3649.
The value was checked by the tables in Tables for Statisticians.
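The approximate form for $y_0$ can be compared with a direct numerical evaluation of $F(r, \nu)$. The sketch below assumes the constants of this example with $a$ in years, takes $F(r,\nu)$ as the integral of $\cos^r\theta\,e^{-\nu\theta}$ between $-\tfrac{1}{2}\pi$ and $\tfrac{1}{2}\pi$ (as in the proof), and integrates by Simpson's rule; the function names and the number of steps are hypothetical choices.

```python
import math

def y0_approx(n, a, r, nu):
    """Approximation to y0 for Type IV when r is fairly large (tan phi = nu/r)."""
    phi = math.atan(nu / r)
    c = math.cos(phi)
    exponent = c * c / (3.0 * r) - 1.0 / (12.0 * r) - nu * phi
    return (n / a) * math.sqrt(r / (2.0 * math.pi)) * math.exp(exponent) / c ** (r + 1.0)

def big_f(r, nu, steps=4000):
    """Simpson's rule for the integral of cos^r(t) e^(-nu t) over (-pi/2, pi/2)."""
    lo, hi = -0.5 * math.pi, 0.5 * math.pi
    h = (hi - lo) / steps                     # steps must be even
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        c = max(math.cos(t), 0.0)             # guard against rounding at the ends
        f = c ** r * math.exp(-nu * t)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0

n, a, r, nu = 9154, 66.9576, 40.12143, 4.450398
approx = y0_approx(n, a, r, nu)
exact = n / (a * big_f(r, nu))
print(round(approx, 2), round(exact, 2))
```

The two figures should differ only in the later decimals when $r$ is as large as it is here, which is the sense in which the approximation is described as close.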

The calculation of ordinates by the double process is as
follows:

$\theta$     $x$ in years    $-4.450398\,\theta\log e$    $42.12143\log\cos\theta$    $\log y$      $y$
      of age

0°      0                  -                        -                   2.43675    273.37
1°    1.1687         $\bar{1}.96637$              $\bar{1}.99721$             2.40033    251.38
2°    2.3382         $\bar{1}.93253$              $\bar{1}.98885$             2.35813    228.10

The second column is formed directly from the tables of
$\tan\theta$ by multiplying by $a$, and as $x$ is required in years,
13.39152 × 5 = 66.9576 should be used for $a$. The fourth
column is formed by multiplying $\log\cos\theta$ by $r + 2$, and the
third continuously by addition. When $\theta$ is negative, the third
column has to be subtracted from the fourth: i.e. it ceases to be
negative and becomes positive. In each case the fifth is formed
from the fourth + the third + $\log y_0$.
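The double process can be written out directly. This sketch assumes the constants of the example ($y_0$ = 273.3649, $a$ = 66.9576 in years, $r$ = 40.12143, $\nu$ = 4.450398) and works with natural functions rather than logarithms; the function name is hypothetical.

```python
import math

def type4_points(y0, a, r, nu, degrees):
    """Corresponding (theta in degrees, x, y) from x = a tan(theta),
    y = y0 cos^(r+2)(theta) e^(-nu theta), theta in circular measure."""
    rows = []
    for deg in degrees:
        theta = math.radians(deg)
        x = a * math.tan(theta)
        y = y0 * math.cos(theta) ** (r + 2.0) * math.exp(-nu * theta)
        rows.append((deg, x, y))
    return rows

rows = type4_points(273.3649, 66.9576, 40.12143, 4.450398, [-2, -1, 0, 1, 2])
for deg, x, y in rows:
    print(deg, round(x, 4), round(y, 2))
```

For negative $\theta$ the ordinates rise above $y_0$, because the mode lies before the origin when $\nu$ is positive.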

When drawing a curve of this type the position and height
of the mode can be noted and then corresponding points
inserted, e.g. $x$ = 1.1687 and $y$ = 251.38. Care must be
taken to give the curve its maximum at the right point.
If the calculation is made directly, the following columns
can be used:

(1) $x/a$
(2) $1 + x^2/a^2$
(3) $\log (1 + x^2/a^2)$
(4) $\tan^{-1} x/a$ in degrees, etc.
(5) col. (4) in circular measure
(6) col. (5) × $(-\nu\log_{10} e)$
(7) $-m$ × col. (3)
(8) $\log y_0$ + (6) + (7)
(9) $y$ = antilog (8)

Col. (2) can be formed by differences since $\Delta(1 + X^2)
= 2X + 1$; $\tan^{-1} x/a$ has to be found by using a table of the
tangents of angles inversely. A table helpful for obtaining
col. (5) from col. (4) will be found in Chambers' Mathematical
Tables or in Tables for Statisticians.

The troublesome work of inverse interpolation in degrees,
minutes and seconds can be avoided by numbering the items
in a table of $\tan\theta$ from 0 onwards. Chambers' Tables, for
instance, give tangents for each minute in the following form:

  ′         0°               1°                  2°, etc.

  0      0.0000000     (60) 0.0174551     (120) 0.0349208
  1      0.0002909     (61) 0.0177460     (121) 0.0352120
  2      0.0005818     (62) 0.0180370     (122) 0.0355033
  3      0.0008727     (63) 0.0183280     (123) 0.0357945
 etc.

If in the column headed 1° we insert 60, 61, etc., and in the
column headed 2° we insert 120, 121, etc., as indicated by the
figures in brackets, we can make the inverse interpolation
in minutes. Then, as one minute in circular measure is
0.0002908882, we can obtain the figure we require by multi-
plying by the conversion factor. In practice however it would
be combined into one multiplier with $(-\nu\log_{10} e)$ and col. (6)
would be found directly from col. (4) by multiplying, in our
example, by 003519003. The labour of inserting the minutes
in a printed table is small, as all we need to do is to write the
number of minutes under the number of degrees at the head of
each column and add thereto at sight the marginal minutes
when the interpolations are being made.

Tables of $\tan^{-1}\theta$, etc. will be published shortly (Tracts for
Computers, No. xxiii) and these tables will simplify the cal-
culations.

PROOF


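In

$$y = y_0\left(1+\frac{x^2}{a^2}\right)^{-m}e^{-\nu\tan^{-1}(x/a)}$$

put $\tan\theta = x/a$, so that $\theta = \tan^{-1}(x/a)$ and

$$\left(1+\frac{x^2}{a^2}\right)^{-m} = \{1+\tan^2\theta\}^{-m} = (\sec^2\theta)^{-m} = \cos^{2m}\theta.$$

Then

$$y = y_0\cos^{2m}\theta\;e^{-\nu\theta}.$$

Now

$$N = \int_{-\infty}^{\infty} y_0\left(1+\frac{x^2}{a^2}\right)^{-m}e^{-\nu\tan^{-1}(x/a)}\,dx
= \int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi} y_0\cos^{2m}\theta\;e^{-\nu\theta}\,\frac{a}{\cos^2\theta}\,d\theta,$$

by substituting $\tan\theta = x/a$ so that $\dfrac{dx}{d\theta} = a\sec^2\theta = \dfrac{a}{\cos^2\theta}$,

$$= y_0\,a\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\;e^{-\nu\theta}\,d\theta\quad\text{where } r = 2m-2$$

$$= y_0\,a\,e^{-\frac{1}{2}\nu\pi}\int_0^{\pi}\sin^r\phi\;e^{\nu\phi}\,d\phi,$$

substituting $\sin\phi$ for $\cos\theta$ so that $\tfrac{1}{2}\pi = \theta + \phi$ and changing
limits, $= y_0\,a\,F(r, \nu)$, say.

The $n$th moment about the origin is

$$\mu'_n = \frac{1}{N}\int_{-\infty}^{\infty} y\,x^n\,dx
= \frac{1}{N}\int_{-\infty}^{\infty} y_0\left(1+\frac{x^2}{a^2}\right)^{-m}e^{-\nu\tan^{-1}(x/a)}\,x^n\,dx$$

$$= \frac{1}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi} y_0\,a^{n+1}\cos^{2m-2}\theta\,\tan^n\theta\;e^{-\nu\theta}\,d\theta,\quad\text{by substituting as above,}$$

$$= \frac{y_0\,a^{n+1}}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^{r-n}\theta\,\sin^n\theta\;e^{-\nu\theta}\,d\theta$$

$$= -\frac{y_0\,a^{n+1}}{N}\left[\frac{\cos^{r-n+1}\theta\,\sin^{n-1}\theta\;e^{-\nu\theta}}{r-n+1}\right]_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}
+ \frac{y_0\,a^{n+1}}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\frac{\cos^{r-n+1}\theta}{r-n+1}\left\{(n-1)\sin^{n-2}\theta\cos\theta\;e^{-\nu\theta} - \nu e^{-\nu\theta}\sin^{n-1}\theta\right\}d\theta$$

by integrating by parts and treating $\sin^{n-1}\theta\,e^{-\nu\theta}$ as one part
and $\cos^{r-n}\theta\sin\theta$ as the other, and remembering that

$$\int\cos^{r-n}\theta\,\sin\theta\,d\theta = -\frac{\cos^{r-n+1}\theta}{r-n+1}.$$

Now, since $\cos^{r-n+1}\theta\,\sin^{n-1}\theta\;e^{-\nu\theta} = 0$ when $\theta$ becomes $\tfrac{1}{2}\pi$ or
$-\tfrac{1}{2}\pi$, we have

$$\mu'_n = \frac{y_0\,a^{n+1}}{N(r-n+1)}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\left\{(n-1)\cos^{r-n+2}\theta\,\sin^{n-2}\theta\;e^{-\nu\theta} - \nu\cos^{r-n+1}\theta\,\sin^{n-1}\theta\;e^{-\nu\theta}\right\}d\theta$$

$$= \frac{a}{r-n+1}\left\{(n-1)\,a\,\mu'_{n-2} - \nu\,\mu'_{n-1}\right\}.$$

Further,

$$\mu'_1 = \frac{y_0\,a^2}{N}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\,\tan\theta\;e^{-\nu\theta}\,d\theta
= -\frac{y_0\,a^2}{Nr}\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\nu\cos^r\theta\;e^{-\nu\theta}\,d\theta$$

by putting $n = 1$ in the above equation for $\mu'_n$,

$$= -\frac{a\nu}{r}$$

because

$$N = y_0\,a\int_{-\frac{1}{2}\pi}^{\frac{1}{2}\pi}\cos^r\theta\;e^{-\nu\theta}\,d\theta.$$

Using the last result with the formula for the $n$th moment in terms
of the two previous moments, and remembering that $\mu'_0$ is unity,

$$\mu'_2 = \frac{a^2}{r(r-1)}\,(r+\nu^2)$$

$$\mu'_3 = -\frac{a^3\nu}{r(r-1)(r-2)}\,(3r-2+\nu^2)$$

$$\mu'_4 = \frac{a^4}{r(r-1)(r-2)(r-3)}\,\{3r(r-2)+\nu^2(6r-8)+\nu^4\}$$

Referring these moments to the centroid vertical, we have,
by putting $d = \mu'_1 = -\dfrac{a\nu}{r}$ in the formulae on p. 57,

$$\mu_2 = \frac{a^2(r^2+\nu^2)}{r^2(r-1)}$$

$$\mu_3 = -\frac{4a^3\nu\,(r^2+\nu^2)}{r^3(r-1)(r-2)}$$

$$\mu_4 = \frac{3a^4(r^2+\nu^2)\{(r+6)(r^2+\nu^2)-8r^2\}}{r^4(r-1)(r-2)(r-3)}$$

If now, we put $z$ for $r^2 + \nu^2$, and write as before,
$\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$, we have

$$8 - \frac{\beta_1(r-2)^2}{2(r-1)} = \frac{8r^2}{z}$$

and

$$\frac{\beta_2(r-2)(r-3)}{3(r-1)} - r - 6 = -\frac{8r^2}{z}.$$

Adding and dividing out by $r - 2$, we have

$$r = \frac{6(\beta_2-\beta_1-1)}{2\beta_2-3\beta_1-6}$$

and

$$z = \frac{r^2}{1-\dfrac{\beta_1(r-2)^2}{16(r-1)}}.$$

Finally, since $\nu^2 = z - r^2$, the other formulae on p. 66 follow at
once.

Since the tangent at the top of the maximum ordinate is
parallel to the axis of $x$, the position of the mode is such that
$dy/dx$ is zero at that point, i.e.

$$y_0\left(1+\frac{x^2}{a^2}\right)^{-(m+1)}e^{-\nu\tan^{-1}(x/a)}\left\{-\frac{2mx}{a^2}-\frac{\nu}{a}\right\}$$

is zero. There are three cases, $x = -\infty$, $x = +\infty$, and a value
of $x$ such that $\dfrac{2mx}{a^2}+\dfrac{\nu}{a}$ is zero, or $x = -\dfrac{\nu a}{2m}$. The distance of the
mean from the origin is $\mu'_1$ or $-\dfrac{\nu a}{r}$ and, therefore, the distance
between the mean and mode is $\dfrac{\nu a}{r}-\dfrac{\nu a}{2m} = \dfrac{2\nu a}{r(r+2)}$, which reduces to
the expression given on p. 66, when the values for $\nu$ and $a$,
on the same page, are inserted.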

It will be useful to give another example of the calculation
of $y_0$ for curves of this type, and we may take a curve where
$r$ = 29.590, $\nu$ = 19.886, $a$ = 13.650, $N$ = 2162. Hence $\tan\phi$
= 0.67205, $\phi$ = 33° 54′, $\cos\phi$ = 0.82998, $\log\cos\phi = \bar{1}.91907$,
and $\phi$ in circular measure is 0.59172.

$\log N$                         = 3.33486
colog $a$                        = $\bar{2}.86486$
$\tfrac{1}{2}\log r$                      = 0.73557
colog $\sqrt{2\pi}$                = $\bar{1}.60091$

$\cos^2\phi/(3r)$  =   0.00776
$-1/(12r)$         =  -0.00282
$-\nu\phi$         = -11.76700
                     -11.762   × $\log_{10} e$ = $\bar{6}.89183$

colog $(\cos\phi)^{r+1}$         = 2.47564
$\log y_0$                       = $\bar{1}.90367$
$y_0$ = 0.80107

The form just considered is sufficiently accurate for all
practical purposes provided $\nu$ is not very small. If $\nu < 2$ the
tables in Tables for Statisticians must be used.

THIRD MAIN TYPE (TYPE VI)

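$$y = y_0\,(x-a)^{q_2}\,x^{-q_1}$$

Origin at $a$ before start of curve.

The values to be calculated in order are

$$r = \frac{6(\beta_2-\beta_1-1)}{6+3\beta_1-2\beta_2}$$

$$a = \tfrac{1}{2}\sqrt{\mu_2}\,\sqrt{\beta_1(r+2)^2+16(r+1)}$$

$q_2$ and $-q_1$ are given by

$$\frac{r-2}{2} \pm \frac{r(r+2)}{2}\sqrt{\frac{\beta_1}{\beta_1(r+2)^2+16(r+1)}}$$

$$y_0 = \frac{N\,a^{q_1-q_2-1}\,\Gamma(q_1)}{\Gamma(q_1-q_2-1)\,\Gamma(q_2+1)}$$

$$\text{Origin} = \text{Mean} - \frac{a(q_1-1)}{q_1-q_2-2}$$

$$\text{Mode} = \text{Mean} - \frac{1}{2}\,\frac{\mu_3}{\mu_2}\,\frac{r+2}{r-2}$$

If expressing curve with origin at mean (see Table VI,
facing p. 51):

$$A_1 = \frac{a(q_1-1)}{q_1-q_2-2},\qquad A_2 = \frac{a(q_2+1)}{q_1-q_2-2}$$

$$y_e = \frac{N\,(q_2+1)^{q_2}\,(q_1-q_2-2)^{q_1-q_2}\,\Gamma(q_1)}{a\,(q_1-1)^{q_1}\,\Gamma(q_1-q_2-1)\,\Gamma(q_2+1)}$$

NOTES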


The range is from $a$ to $\infty$ and the method is like that of
Type I. If $\mu_3$ is negative, then $a$ is negative and the range is
from $-\infty$ to $-a$.

$r$ is always negative and $q_1$ is greater than $q_2$. If $q_2$ is negative,
the curve is J-shaped.

The value of $y_0$ does not correspond to any frequency, as it
relates to a point before the curve starts.

The reader will probably find it easier to work with the origin
at the mean, and in the numerical example both forms are
shown.

EXAMPLE 

The number of entrants, limited payment policies, 1863-93
experience, was summed in groups of ten years of age and
divided by 100, and the following series was obtained:

No. of entrants     Graduated by Type VI
    ÷ 100                  curve

       1                     1
      56                    50
     167                   168
      98                   100
      34                    36
       9                    10
       2                     2
       1                    .5

     368                   368


The moments, etc. were

Mean at 0.402174 after the centre of the 167 group

$\mu_2$ = 0.928835          $1 - q_1$ = -41.03080
$\mu_3$ = 0.893096          $1 + q_2$ = 7.60950
$\mu_4$ = 4.088800          $q_1$ = 42.03080
$\beta_1$ = 0.9953605         $q_2$ = 6.60950
$\beta_2$ = 4.739349          $a$ = 10.37947
$\kappa$ = 1.895              $\log y_0$ = 46.1821
$r$ = -33.42129

The origin is 12.74270 before the mean or 12.34053
before the centre of the 167 group, and the curve starts at
12.34053 - 10.37947 = 1.96106 before the centre of the largest
group. This makes the start of the curve at about age 10,
which is reasonable.
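The constants of this example can be recomputed from $\mu_2$, $\beta_1$ and $\beta_2$. The sketch below assumes the order of calculation given for Type VI with $\mu_3$ positive; the function name is hypothetical, and small differences in the later figures from the printed values are to be expected from rounding in the quoted $\beta$'s.

```python
import math

def type6_constants(mu2, beta1, beta2):
    """r, a, q1, q2 and the distance of the origin before the mean (Type VI)."""
    r = 6.0 * (beta2 - beta1 - 1.0) / (6.0 + 3.0 * beta1 - 2.0 * beta2)
    disc = beta1 * (r + 2.0) ** 2 + 16.0 * (r + 1.0)
    a = 0.5 * math.sqrt(mu2) * math.sqrt(disc)
    half_spread = 0.5 * r * (r + 2.0) * math.sqrt(beta1 / disc)
    q2 = 0.5 * (r - 2.0) + half_spread          # the positive root
    q1 = -(0.5 * (r - 2.0) - half_spread)       # minus the negative root
    origin_before_mean = a * (q1 - 1.0) / (q1 - q2 - 2.0)
    return r, a, q1, q2, origin_before_mean

r, a, q1, q2, origin_before_mean = type6_constants(0.928835, 0.9953605, 4.739349)
# The mean is 0.402174 after the centre of the largest group.
start_before_centre = origin_before_mean - 0.402174 - a
print(round(r, 3), round(a, 3), round(q1, 3), round(q2, 3),
      round(origin_before_mean, 3), round(start_before_centre, 3))
```

Note that $q_2 - q_1 = r - 2$, which gives a quick check on the two roots.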

If we use the origin at the mean, we have

$A_1$ = 12.74270, $A_2$ = 2.36324, $y_e$ = 147.4

and the range is from -2.36324 to $\infty$.
The curve was calculated as follows:

Columns: (1) $x$; (2) $\log x$; (3) $\log (x-a)$; (4) $-q_1\log x$;
(5) $q_2\log (x-a)$; (6) $\log y$; (7) $y$.

There is no difficulty in writing down the values for cols. (2)
and (3) without using col. (1), as only the whole numbers in
$x$ and $x - a$ change, the decimal remaining constant so long
as equidistant ordinates are required. Cols. (4) and (5) are
obtained directly, and col. (6) by adding cols. (4) and (5) to
$\log y_0$. Cols. (2) and (3) can be formed continuously with the
aid of Gauss-logarithms.

The mode, which is useful for drawing the curve, is 0.02429
before the centre of the largest group.
With the origin at the mean the form of the columns is
similar to that already shown for Type I.


PROOF

$$N = \int_a^{\infty} y_0\,(x-a)^{q_2}\,x^{-q_1}\,dx$$

$$= \int_1^0 y_0\,a^{q_2-q_1}\left(\frac{1}{z}-1\right)^{q_2} z^{q_1}\,(-a z^{-2})\,dz$$

by substituting $1/z$ for $x/a$

$$= \int_0^1 y_0\,a^{q_2-q_1+1}\,(1-z)^{q_2}\,z^{q_1-q_2-2}\,dz$$

$$y_0 = \frac{N}{a^{q_2-q_1+1}\,B(q_2+1,\;q_1-q_2-1)}
= \frac{N\,a^{q_1-q_2-1}\,\Gamma(q_1)}{\Gamma(q_2+1)\,\Gamma(q_1-q_2-1)}$$

The $n$th moment about the origin is

$$\mu'_n = \frac{1}{N}\int_a^{\infty} y_0\,x^n\,(x-a)^{q_2}\,x^{-q_1}\,dx
= \frac{y_0\,\Gamma(q_1-q_2-n-1)\,\Gamma(q_2+1)}{N\,a^{q_1-q_2-n-1}\,\Gamma(q_1-n)}$$

by the same substitution as that used above.

From this last result we obtain, by inserting the value of $y_0$,
and remembering the relationship between $\Gamma(q_1)$ and $\Gamma(q_1 - 1)$,
etc.,

$$\mu'_1 = \frac{a(q_1-1)}{q_1-q_2-2}$$

$$\mu'_2 = \frac{a^2(q_1-1)(q_1-2)}{(q_1-q_2-2)(q_1-q_2-3)}\quad\text{etc.}$$

It will be noticed that these equations are the same as those
obtained for Type I if $m_1 = -q_1$, $m_2 = q_2$ and $b = a$. Thus, we
can use the whole of the Type I solution, provided we bear in
mind that the range is from $x = a$ to $x = \infty$.

TRANSITION TYPE


“NORMAL CURVE OF ERROR”

$$y = y_0\,e^{-x^2/c}$$

$$c = 2\sigma^2$$

$$y_0 = \frac{N}{\sqrt{2\pi\mu_2}}$$

NOTES

This curve has been known by various names, such as the
Probability Curve and the Gaussian Curve. It was discussed
before Gauss by de Moivre and Laplace. It is the limit of
$(p + q)^n$ where $p + q = 1$, when $n$ approaches infinity and if
neither $p$ nor $q$ is very small. It gives a close representation
of $(\tfrac{1}{2} + \tfrac{1}{2})^n$ even when $n$ is not large.
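The closeness of the normal curve to the symmetrical binomial for a small index can be seen from a short comparison. This is a sketch under stated assumptions: $n = 10$ is an arbitrary illustrative choice, the total frequency is taken as unity, and the function names are hypothetical.

```python
import math

def binomial_half(n):
    """Terms of (1/2 + 1/2)^n."""
    return [math.comb(n, k) / 2 ** n for k in range(n + 1)]

def normal_ordinates(n):
    """Ordinates of y = y0 exp(-x^2 / c), with c = 2 * variance, y0 = 1 / sqrt(2 pi mu2),
    using the binomial's own mean n/2 and variance n/4."""
    mean = n / 2.0
    variance = n / 4.0
    y0 = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return [y0 * math.exp(-(k - mean) ** 2 / (2.0 * variance)) for k in range(n + 1)]

n = 10
terms = binomial_half(n)
curve = normal_ordinates(n)
worst = max(abs(b - g) for b, g in zip(terms, curve))
print(round(terms[5], 4), round(curve[5], 4), round(worst, 4))
```

The largest discrepancy occurs at the central term and is already under one per cent of the total frequency.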


EXAMPLES

The following table gives, in col. (2), the sums assured and
bonuses, and in col. (4) the reserves resulting from grouping
a number of Endowment Assurances according to their office
years of birth:

Central age      Sums assured and bonuses        Reserves ÷ 1,000
for groups              ÷ 1,000
of 5 years
of birth        Ungraduated   Graduated     Ungraduated   Graduated
   (1)              (2)          (3)            (4)          (5)

   17                11           13             0.6          0.6
   22                48           40             2.8          2.8
   27               124          104            11.5         10.9
   32               213          202            27.7         30.1
   37               281          282            59.1         58.4
   42               295          288            84.7         79.9
   47               185          214            74.1         77.0
   52               104          116            50.5         52.2
   57                40           44            23.2         25.0
   62                15           13            12.2          8.4
   67                 3            3             1.3          2.4

 Total            1,319        1,319           347.7        347.7


The following table shows the moments and constants:

Constant                 Sum assured and bonus      Reserves

Mean age                      39.202426             43.967213
$\mu_2$                        3.066840              2.769635
$\mu_3$                        0.650127              0.029805
$\mu_4$                       27.02516              22.40663
$\beta_1$                      0.014653              0.0000418
$\beta_2$                      2.873346              2.920997
$\kappa$                      -0.005                -0.0002
$\sigma\ (= \sqrt{\mu_2})$     1.751237              1.664222
$\sigma^{-1}$                  0.5710248             0.6008813
$y_0$                        300.4760               83.34959

The criteria for the normal curve are $\kappa$ = 0, $\beta_1$ = 0, and
$\beta_2$ = 3. The values given above do not differ very greatly from
these, but a comparison of the graduated and ungraduated
figures shows that the reserve curve agrees better than the
sum assured curve, partly because the value of $\beta_2$ is closer
to 3, and $\beta_1$ has a larger value in the case of the sum
assured.

For the calculation of $y_0$ the value of

colog $\sqrt{2\pi}$ = $\bar{1}.6009100657$

is required.

In finding the areas for the comparison between the
graduated and ungraduated figures it is unnecessary to
calculate the ordinates, as one of the calculated tables of the
probability integral can be used. The table by W. F. Sheppard
included in Tables for Statisticians is very convenient, and the
columns in the table below show how it was used to calculate

Columns: (1) Age $x$; (2) distance from origin in calculation units,
i.e. 5 years of age; (3) previous column × $\sigma^{-1}$; (4) values of
$\tfrac{1}{2}(1+\alpha)$ from Sheppard's tables using differences (area from
origin to $x$); (5) difference of previous column = area for age group
$x$ to $x + 5$; (6) area multiplied by 347.7 (total frequency).

  (1)        (2)          (3)          (4)         (5)        (6)

 14.5     5.893443     3.541258         -        0.00164*     0.6
 19.5     4.893443     2.940377     0.99836      0.00802      2.8
 24.5     3.893443     2.339496     0.99034      0.03139     10.9
 29.5     2.893443     1.738615     0.95895      0.08657     30.1
 34.5     1.893443     1.137734     0.87238      0.16806     58.4
 39.5     0.893443     0.536853     0.70432      0.22985†    79.9
 44.5     0.106557     0.064028     0.52553      0.22141     77.0
 49.5     1.106557     0.664909     0.74694      0.15018     52.2
 54.5     2.106557     1.265790     0.89712      0.07190     25.0
 59.5     3.106557     1.866671     0.96902      0.02418      8.4
 64.5     4.106557     2.467552     0.99320      0.00572      2.0
 69.5     5.106557     3.068433     0.99892      0.00108*     0.4

* Remainders of areas beyond 19.5 and 69.5.

† (0.70432 - 0.50000) + (0.52553 - 0.50000) because we pass across the origin, and
a piece of the group is on each side of it.


the areas in one of the cases (the reserves). Sheppard's tables
give the areas and ordinates of the normal curve in terms of
the standard deviation, that is, he assumes the standard
deviation to be unity, and his tables must be entered by
using intervals of $\sigma^{-1}$. A short abstract from Sheppard's table
is given on p. 265. Most other published tables are based on
the standard deviation multiplied by $\sqrt{2}$ and the distinction
must be borne in mind if other tables are used.

The second column can be left out when the method has
been grasped. The ages in the first column were taken con-
sistently with the assumptions that 17, 22, etc. were the
central ages of the groups.
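The areas for the reserves can be reproduced without printed tables. The sketch below assumes the constants quoted for the reserves (mean age 43.967213, $\sigma$ = 1.664222 in units of five years, total 347.7) and uses `math.erf` in place of Sheppard's table; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Area of the unit normal curve up to z, i.e. (1 + alpha)/2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def group_areas(total, mean_age, sigma_units, boundaries, unit=5.0):
    """Graduated frequency in each age group between successive boundaries."""
    z = [(b - mean_age) / unit / sigma_units for b in boundaries]
    return [total * (normal_cdf(z2) - normal_cdf(z1)) for z1, z2 in zip(z, z[1:])]

total, mean_age, sigma = 347.7, 43.967213, 1.664222
boundaries = [19.5 + 5.0 * i for i in range(11)]       # 19.5, 24.5, ..., 69.5
areas = group_areas(total, mean_age, sigma, boundaries)
lower_tail = total * normal_cdf((19.5 - mean_age) / 5.0 / sigma)
upper_tail = total * (1.0 - normal_cdf((69.5 - mean_age) / 5.0 / sigma))
y0 = total / (sigma * math.sqrt(2.0 * math.pi))
print([round(v, 1) for v in areas], round(lower_tail, 1), round(upper_tail, 1), round(y0, 4))
```

Because the cumulative area is used directly, there is no need for the special treatment of the group that straddles the origin; the two tails correspond to the remainders marked with an asterisk in the table.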

[Figure: “Normal Curve of Error”; curves for Sums Assured and Reserves.]

If ordinates are required, the $z$ column in Sheppard's tables
must be used. It was with its help that the curves in the figure
were drawn. The statistics and curve for the reserves are
shown by the dotted lines.
An average reserve for any group can be obtained by means
of the graduated figures, and it could be used to test the reserves
obtained at any future valuation. This is by no means the only
rough check that can be applied, but it is interesting because
it shows a use to which frequency-curves might be put in
practical office routine.


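PROOF

To show that

$$\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2},$$

let

$$\int_0^\infty e^{-x^2}\,dx = k;$$

then, substituting $ax$ for $x$, we have

$$\int_0^\infty a\,e^{-a^2x^2}\,dx = k.$$

Hence

$$\int_0^\infty a\,e^{-a^2(1+x^2)}\,dx = k\,e^{-a^2}$$

and, integrating with respect to $a$,

$$\int_0^\infty\!\!\int_0^\infty a\,e^{-a^2(1+x^2)}\,dx\,da = k\int_0^\infty e^{-a^2}\,da = k^2.$$

But

$$\int_0^\infty e^{-a^2(1+x^2)}\,a\,da = \frac{1}{2}\,\frac{1}{1+x^2}$$

and

$$\frac{1}{2}\int_0^\infty \frac{dx}{1+x^2} = \frac{\pi}{4}.$$

Hence

$$k^2 = \frac{\pi}{4} \quad\text{or}\quad k = \frac{\sqrt{\pi}}{2},$$

and

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$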
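The other constant is obtained as follows:

$$\int_{-\infty}^{\infty} y_0 e^{-x^2/c}\,dx = y_0\Big[x\,e^{-x^2/c}\Big]_{-\infty}^{\infty} + \frac{2y_0}{c}\int_{-\infty}^{\infty} x^2 e^{-x^2/c}\,dx \quad\text{(by parts)}$$

$$= \frac{2y_0}{c}\int_{-\infty}^{\infty} x^2 e^{-x^2/c}\,dx,$$

that is,

$$N = \frac{2N\mu_2}{c}, \qquad c = 2\mu_2.$$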
TRANSITION TYPE (TYPE II)

$$y = y_0\left(1 - \frac{x^2}{a^2}\right)^m$$

Origin at mode (= mean)

$$m = \frac{5\beta_2 - 9}{2(3 - \beta_2)}, \qquad a^2 = \frac{2\mu_2\beta_2}{3 - \beta_2}$$

$$y_0 = \frac{N\,\Gamma(2m+2)}{a\,2^{2m+1}\{\Gamma(m+1)\}^2} = \frac{N}{a\sqrt{\pi}}\,\frac{\Gamma(m+\tfrac{3}{2})}{\Gamma(m+1)}$$

NOTES AND PROOF

Put $\beta_1 = 0$ in Type I, for the curve is symmetrical, and
therefore $\mu_3 = 0$. For the same reason it is clear that $m_1 = m_2$.

Approximations to $\Gamma$ may be used if $m$ is large.

If $m$ is positive, the curve starts at zero, rises to a maximum
and falls again to zero, but if $m$ is negative, it starts at infinity,
falls, and then rises to infinity again.

EXAMPLE

In the discussion that followed the reading of G. J. Lidstone's
paper on Endowment Assurances, G. F. Hardy said that "the
errors in the successive groups formed a curve very similar to
the normal curve of error" (J. Inst. Actu. XXXIV, 87), and the
series in question is a rather interesting example of a
symmetrical distribution.

Unexpired term     Error involved in using
in years           "mean age" method
   0-4                      11
   5-9                     116
  10-14                    274
  15-19                    451
  20-24                    432
  25-29                    267
  30-34                    116
  35, etc.                  16
                         1,683

Moments were calculated about the centre of the 15-19
group, and .4985146, 2.161022, 3.104576, and 12.60666 were
found for the first four moments; transferring to the mean
(17.5 + 2.492573 = 19.992573), and using Sheppard's
adjustments, the following values result:

$\mu_2 = 1.829172$      $\beta_1 = .0023706$
$\mu_3 = .120452$       $\beta_2 = 2.548313$
$\mu_4 = 8.52636$       $\kappa = -.007492$

which shows that Type II can be used.
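The arithmetic of this step can be checked with a short sketch. It assumes the usual Sheppard adjustments for unit grouping ($\mu_2$ reduced by 1/12, $\mu_3$ unchanged, $\mu_4$ reduced by half the unadjusted $\mu_2$ and increased by 7/240); those formulae are not written out in the passage, and the function names are illustrative only.

```python
def raw_moments(freqs, origin_index):
    """First four moments about the centre of group `origin_index`, unit = one group."""
    n = sum(freqs)
    return [sum(f * (i - origin_index) ** k for i, f in enumerate(freqs)) / n
            for k in (1, 2, 3, 4)]

def adjusted_central_moments(raw):
    """Transfer to the mean, then apply Sheppard's adjustments (assumed form)."""
    v1, v2, v3, v4 = raw
    m2 = v2 - v1 ** 2
    m3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    m4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    return m2 - 1 / 12, m3, m4 - m2 / 2 + 7 / 240

errors = [11, 116, 274, 451, 432, 267, 116, 16]   # series from the table above
raw = raw_moments(errors, origin_index=3)          # centre of the 15-19 group
mu2, mu3, mu4 = adjusted_central_moments(raw)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
mean_age = 17.5 + 5 * raw[0]                       # groups are five years wide
print(raw, mu2, mu3, mu4, beta1, beta2, mean_age)
```

Run on the series above, the sketch should agree with the printed moments to about three decimal places; small differences in the last figures of $\mu_4$ are to be expected from hand arithmetic.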
The equations for the type give

$m = 4.141766$, $a = 4.543079$, $y_0 = 462.57$.
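As a check on these constants, here is a small sketch that evaluates the Type II equations directly. It uses the gamma-function form of $y_0$ and the standard-library `math.gamma`; the function name is an assumption made for illustration.

```python
import math

def type_ii_constants(n, mu2, beta2):
    """Constants of y = y0 (1 - x^2/a^2)^m from N, mu2 and beta2 (beta1 taken as 0)."""
    m = (5 * beta2 - 9) / (2 * (3 - beta2))
    a = math.sqrt(2 * mu2 * beta2 / (3 - beta2))
    y0 = n * math.gamma(m + 1.5) / (a * math.sqrt(math.pi) * math.gamma(m + 1))
    return m, a, y0

m, a, y0 = type_ii_constants(1683, 1.829172, 2.548313)
print(m, a, y0)
```

Here $a$ is in units of one five-year group, so the half-range in years is $5a$.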

The mean and mode coincide, because the curve is symmetrical.
For calculating a series of values, the following arrangement
is convenient:

  x     log(1 + x/a)    log(1 - x/a)    (2) + (3)    log y = m × (4) + log y0
 (1)        (2)             (3)            (4)                 (5)

It is easier to work in this way than by calculating values of
$1 - x^2/a^2$. In the particular example, ordinates were calculated
at the beginning, middle, and end of each group, and Simpson's
quadrature formula was used for finding the areas, viz.

$$\text{area of group} = \tfrac{1}{6}\left(y_{\text{beginning}} + 4\,y_{\text{middle}} + y_{\text{end}}\right),$$

the group interval being taken as the unit.

Group       Ungraduated   Areas,     Mid-ordinates,   Areas,
            figures       Type II    Type II          "Normal curve"
0-4              11          14           11               22
5-9             116         109          104               95
10-14           274         286          287              270
15-19           451         433          440              455
20-24           432         433          440              455
25-29           267         285          287              269
30-34           116         109          104               95
35, etc.         16          14           11               22
              1,683       1,683                         1,683

A comparison of the mid-ordinates with the areas gives an
idea of the error involved in using the former for the latter;
the differences are largest at the "tails" and near the
mode.
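The comparison of mid-ordinates with areas can be reproduced with a few lines. This is a sketch under stated assumptions: the default constants are the fitted values quoted in the example, Simpson's rule is applied once per group as in the text, and the helper names are hypothetical.

```python
def type_ii_ordinate(age, mean=19.992573, a=4.543079, m=4.141766,
                     y0=462.57, width=5.0):
    """Ordinate of the fitted Type II curve; `a` is in units of one group width."""
    u = (age - mean) / (width * a)
    return y0 * max(0.0, 1.0 - u * u) ** m

def simpson_group_area(lo, hi, width=5.0):
    """One application of Simpson's rule over a group, in frequency units."""
    mid = (lo + hi) / 2.0
    total = type_ii_ordinate(lo) + 4 * type_ii_ordinate(mid) + type_ii_ordinate(hi)
    return (hi - lo) / width * total / 6.0

for lo in (10, 15, 20, 25):
    print(lo, round(type_ii_ordinate(lo + 2.5), 1), round(simpson_group_area(lo, lo + 5), 1))
```

For the central groups the two columns differ by several units, which is the point made in the text about using mid-ordinates in place of areas.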
The curve starts at 19.992573 - 22.71540 = -2.72283, and
ends at 42.70797.

The final column of the table gives a graduation by the
"normal curve".

[Figure: Type II.]

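TRANSITION TYPE (TYPE VII)

$$y = y_0\left(1 + \frac{x^2}{a^2}\right)^{-m}$$

Origin at mode (= mean)

$$m = \frac{5\beta_2 - 9}{2(\beta_2 - 3)}, \qquad a^2 = \frac{2\mu_2\beta_2}{\beta_2 - 3}, \qquad y_0 = \frac{N}{a\sqrt{\pi}}\,\frac{\Gamma(m)}{\Gamma(m - \tfrac{1}{2})}$$

NOTES AND PROOF

The curve may be taken as a special case of Type IV when
$\nu = 0$, or it can be evolved from Type II by making both $m$ and
$a^2$ negative in that type. This happens when $\beta_2 > 3$. The curve
is symmetrical and of unlimited range in both directions.

$$N = \int_{-\infty}^{+\infty} y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} dx = 2\int_0^{\infty} y_0\left(1 + \frac{x^2}{a^2}\right)^{-m} dx;$$

then putting $1 + x^2/a^2 = z^{-1}$, the reader will be able to show that

$$N = \int_0^1 y_0\,a\,(1 - z)^{-\frac{1}{2}}\,z^{m - \frac{3}{2}}\,dz = a\,y_0\,B(m - \tfrac{1}{2}, \tfrac{1}{2}),$$

or $y_0$ has the value shown above, because $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.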

EXAMPLE

The following table gives the areas when $\beta_2 = 5$ and $\mu_2 = 1$
and shows a graduation by the "normal curve". The example,
together with that of Type II, will act as a reminder that the
"normal curve" does not give entirely satisfactory results even
with symmetrical distributions.

Type VII              Normal curve
m = 4, a² = 5         σ = 1
      1
      1
      2
      4
      7                     1
     16                     5
     38                    24
     93                    93
    225                   278
    527                   656
  1,106                 1,210
  1,858                 1,746
  2,244                 1,974
  1,858                 1,746
   etc.                  etc.
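A rough numerical check of the Type VII column is sketched below. It assumes a total frequency of 10,000 and groups half a unit wide (both inferred from the normal-curve column, not stated in the text), takes $y_0 = N\,\Gamma(m)/\{a\sqrt{\pi}\,\Gamma(m-\tfrac{1}{2})\}$ from the proof, and integrates by the mid-point rule; the function names are illustrative.

```python
import math

def type_vii_y0(n, m, a):
    """Central ordinate, from N = a * y0 * B(m - 1/2, 1/2)."""
    return n * math.gamma(m) / (a * math.sqrt(math.pi) * math.gamma(m - 0.5))

def type_vii_area(lo, hi, n, m, a, steps=2000):
    """Mid-point-rule area under y = y0 (1 + x^2/a^2)^(-m) between lo and hi."""
    y0 = type_vii_y0(n, m, a)
    h = (hi - lo) / steps
    return h * sum(y0 * (1 + (lo + (i + 0.5) * h) ** 2 / a ** 2) ** (-m)
                   for i in range(steps))

n, m, a = 10000, 4, math.sqrt(5)
print(round(type_vii_area(-0.25, 0.25, n, m, a)))   # central group
print(round(type_vii_area(0.25, 0.75, n, m, a)))    # next group
```

The values should land within a few units of the printed figures, though not necessarily digit for digit, since the printed areas were presumably rounded so as to sum to the total.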
TRANSITION TYPE (TYPE III)

$$y = y_0\left(1 + \frac{x}{a}\right)^{\gamma a} e^{-\gamma x}$$

Origin at mode

$$\gamma = \frac{2\mu_2}{\mu_3}, \qquad p = \gamma a = \frac{4}{\beta_1} - 1, \qquad a = \frac{2\mu_2^2}{\mu_3} - \frac{\mu_3}{2\mu_2}$$

$$y_0 = \frac{N}{a}\,\frac{p^{\,p+1}}{e^{p}\,\Gamma(p+1)}$$

$$\text{Mode} = \text{Mean} - \frac{\mu_3}{2\mu_2}$$

If expressing curve with origin at mean (see Table VI, facing
p. 51):

$$y_e = N\gamma\,\frac{(p+1)^{p}}{e^{p+1}\,\Gamma(p+1)}$$

NOTES

The curve is usually bell-shaped, but becomes J-shaped
when $p < 0$, that is, when $\beta_1 > 4$. The range is limited in one
direction only. The criterion is that $2\beta_2 = 6 + 3\beta_1$. Theoretically
this gives $\kappa = \infty$, but the curve may be used in many cases where
$\kappa$ is not very large, provided $2\beta_2$ approximates to $6 + 3\beta_1$.
When $\mu_3$ is positive $\gamma$ and $a$ are positive, so that the range is
limited at a distance of $a$ before the mode; when $\mu_3$ is negative
$\gamma$ and $a$ are negative, so that the range is limited at a distance $a$
after the mode. If, however, $\beta_1 > 4$, then $a$ and $\gamma$ have different
signs.

EXAMPLE

The following statistics are taken from a paper in the Trans.
Actu. Soc. Edinb. IV, 44, and give the numbers of wives
tabulated for the ages of mothers, and according to years since
marriage. The mothers' ages for the particular series are 30 to
34.

Year after     Number of     Graduated by
marriage       wives         Type III curve
   1              44              59
   2             135             111
   3              45              45
   4              12              20
   5               8               9
   6               3               4
   7               1               2
   8               3               1
 Total           251             251

The mean is .3346612 after the middle of the second group,
and the moments about the centroid vertical are 1.441787,
3.606622 and 18.93221, so that $\kappa = -8.44$.
As this value was large, Type III was used, and

$\gamma = .7995221$      $a = -.098007$
$p = -.0783584$     $y_0 = 214.8$
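The constants of this example follow directly from the three moments. The sketch below assumes the usual form of Pearson's criterion, $\kappa = \beta_1(\beta_2+3)^2 / \{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)\}$, which is not written out in this passage; the function names are illustrative.

```python
def pearson_criteria(mu2, mu3, mu4):
    """beta1, beta2 and the criterion kappa (assumed standard form)."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    kappa = (beta1 * (beta2 + 3) ** 2
             / (4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6)))
    return beta1, beta2, kappa

def type_iii_constants(mu2, mu3):
    """gamma, p, a and the mean-to-mode distance for a Type III curve."""
    gamma = 2 * mu2 / mu3
    p = 4 * mu2 ** 3 / mu3 ** 2 - 1
    a = p / gamma
    mean_minus_mode = mu3 / (2 * mu2)
    return gamma, p, a, mean_minus_mode

beta1, beta2, kappa = pearson_criteria(1.441787, 3.606622, 18.93221)
gamma, p, a, shift = type_iii_constants(1.441787, 3.606622)
print(kappa, gamma, p, a, shift)
```

Because $p$ comes out negative here, the expression for $y_0$ cannot be evaluated in real arithmetic as it stands, which is one face of the difficulty discussed next.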
This example is given because it can be used to show a
difficulty rather clearly. At first sight, a curve starting at zero,
rising to a maximum, and then falling, might be expected.
Instead, we find the curve starting at duration .68192,* so that
the first group is made up of a strip on a base .31808 in length,
and has a smaller value than the next group, though any
ordinate read off within the first group would be larger than any
ordinate in the second group. No adjustment was made to the
rough moments.

* The mode in ordinary cases of Type III is given by mean $- \mu_3/(2\mu_2)$. In this
case $\mu_3/(2\mu_2) = 1.25075$, so the mode would be at .58391, and the curve would
start at {"mode" $- a$} = .58391 + .09801 = .68192.



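PROOF

In $y = y_0\left(1 + \dfrac{x}{a}\right)^{\gamma a} e^{-\gamma x}$, put
$\gamma a = p$, and substitute $z$ for $\gamma(a + x)$; then if $N$ be the total
frequency,

$$N = \int_0^\infty y_0\,z^{p}\,a^{-p}\,e^{-z+p}\,\gamma^{-(p+1)}\,dz \qquad \text{for } \frac{dz}{dx} = \gamma$$

$$= y_0\,\frac{e^{p}}{\gamma\,p^{p}}\int_0^\infty z^{p} e^{-z}\,dz = y_0\,\frac{a\,e^{p}}{p^{\,p+1}}\,\Gamma(p+1).$$

This gives

$$y_0 = \frac{N\,p^{\,p+1}}{a\,e^{p}\,\Gamma(p+1)}.$$

The $n$th moment about the start of the curve is

$$\mu'_n = \frac{1}{N}\int_{-a}^{\infty} y_0\left(1 + \frac{x}{a}\right)^{p} e^{-\gamma x}(x + a)^n\,dx
= \frac{y_0\,e^{p}}{N\,p^{p}\,\gamma^{\,n+1}}\int_0^\infty z^{p+n} e^{-z}\,dz
= \frac{\Gamma(p + n + 1)}{\gamma^{n}\,\Gamma(p + 1)}$$

by using the value of $y_0$ found above.

Since $\Gamma(p) = (p - 1)\,\Gamma(p - 1)$, the first moment is $\dfrac{p+1}{\gamma}$, the
second $\dfrac{(p+1)(p+2)}{\gamma^2}$, and the third $\dfrac{(p+1)(p+2)(p+3)}{\gamma^3}$. In
order to apply these formulae to statistical work, it is necessary
to have moments about the centroid vertical, the position of
which (the mean) can be found, and as, by definition, the first
moment about it is zero, we get

$$\mu_2 = \frac{p+1}{\gamma^2} \quad\text{and}\quad \mu_3 = \frac{2(p+1)}{\gamma^3}.$$

These results give $\gamma$ and $p$ as

$$\frac{2\mu_2}{\mu_3} \quad\text{and}\quad \frac{4\mu_2^3}{\mu_3^2} - 1.$$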
TRANSITION CURVE (TYPE V)

$$y = y_0\,x^{-p}\,e^{-\gamma/x}$$

Origin at start of curve

$$p = 4 + \frac{8 + 4\sqrt{4 + \beta_1}}{\beta_1}$$

$$\gamma = (p - 2)\sqrt{\mu_2(p - 3)}$$

$$y_0 = \frac{N\gamma^{\,p-1}}{\Gamma(p - 1)}$$

$$\text{Origin} = \text{Mean} - \frac{\gamma}{p - 2}$$

$$\text{Mode} = \text{Mean} - \frac{2\gamma}{p(p - 2)}$$

The sign of $\gamma$ is the same as that of $\mu_3$.

If expressing curve with origin at mean (see Table VI, facing
p. 51):

$$A = \frac{\gamma}{p - 2}, \qquad y_e = \frac{N(p - 2)^{p}}{\gamma\,e^{\,p-2}\,\Gamma(p - 1)}$$


EXAMPLE

The following series of deaths is taken from G. King's paper
"On the rate of mortality amongst female nominees, etc."
(J. Inst. Actu. XXXIII, 262-8).

Ages         Deaths     Graduated by
                        Type V
30-34           1            1
35-39           5            3
40-44           8            6
45-49          12           14
50-54          28           32
55-59          82           68
60-64         128          137
65-69         253          247
70-74         342          381
75-79         525          480
80-84         438          441
85-89         265          261
90-94          53           80
95-99          18           10
100, etc.       4            1
            2,162        2,162

The mean is at age 75.9782605, and the moments (adjusted),
etc. are

$\mu_2 = 3.573346$       $\beta_1 = .4950399$
$\mu_3 = -4.752613$      $\beta_2 = 3.996134$
$\mu_4 = 51.02583$       $\kappa = .85$

Strictly speaking, Type IV should be used, but the value is not
very far from unity, and the following Type V constants were
found:

$p = 37.29145$
$\gamma = -390.6609$ (negative, because $\mu_3$ is)
$\log y_0 = 56.930518$
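A sketch of the arithmetic behind these constants follows. It assumes the moments are in units of one five-year group (so distances are multiplied by 5 to give ages), and the function names are illustrative rather than anything defined in the text.

```python
import math

def type_v_constants(mu2, mu3, beta1):
    """p and gamma for y = y0 x^(-p) e^(-gamma/x); gamma takes the sign of mu3."""
    p = 4 + (8 + 4 * math.sqrt(4 + beta1)) / beta1
    gamma = (p - 2) * math.sqrt(mu2 * (p - 3))
    return p, (gamma if mu3 >= 0 else -gamma)

def type_v_positions(mean, p, gamma, width=1.0):
    """Origin (start of curve) and mode, converted to the scale of `mean`."""
    origin = mean - width * gamma / (p - 2)
    mode = mean - width * 2 * gamma / (p * (p - 2))
    return origin, mode

p, gamma = type_v_constants(3.573346, -4.752613, 0.4950399)
origin, mode = type_v_positions(75.9782605, p, gamma, width=5.0)
print(p, gamma, origin, mode)
```

With $\mu_3$ negative the origin lies above the mean, at an age well beyond the data, and the curve runs downwards in age from it.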
The approximation to the value of $\log \Gamma(p - 1)$ was used. The
origin is at age 131.32606, and the mode at 78.9467.

The columns used for calculating the ordinates were:

  x     log x    -p × (2)    (1/x) × (-γ log10 e)    log y = log y0 + (3) + (4)    y = antilog (5)
 (1)     (2)       (3)               (4)                        (5)                      (6)

Col. (4) is best formed by putting $\gamma \log_{10} e$ on the plate of the
arithmometer, multiplying it by $1/x$, obtained from a table of
reciprocals, and reading off the result negatively.

The point to be borne in mind in drawing a curve of this type
is that as the mode and origin are not at the same place, care
must be taken to give the maximum ordinate its right position
and magnitude (cf. Type IV).

The graduated figures agree fairly closely with the original
statistics below the 90-94 group, but are unsuitable for that
and the two later groups. The reason is that Type IV, having
an unlimited range, should be used. The particular case was
chosen partly because an example in which $\mu_3$ is negative is
rather more awkward than when $\mu_3$ is positive. In such cases it
is a good check to imagine the statistics written in inverse
order (in this case 4, 18, 53, etc.), and so avoid the negative
signs.




PROOF

Putting $\gamma/x = z$ in $y = y_0\,e^{-\gamma/x}\,x^{-p}$, and integrating from 0
to $\infty$, we have $N = y_0\,\gamma^{\,1-p}\,\Gamma(p - 1)$,

$$\text{or}\quad y_0 = \frac{N\gamma^{\,p-1}}{\Gamma(p - 1)}.$$

Using the same substitution, the $n$th moment about the
origin is

$$\mu'_n = \frac{y_0}{N}\,\gamma^{\,n-p+1}\int_0^\infty e^{-z} z^{\,p-n-2}\,dz
= \frac{y_0}{N}\,\gamma^{\,n-p+1}\,\Gamma(p - n - 1)
= \frac{\gamma^{n}\,\Gamma(p - n - 1)}{\Gamma(p - 1)}.$$

This gives

$$\mu'_1 = \frac{\gamma}{p - 2},$$

which is the distance between the mean and origin,

$$\mu'_2 = \frac{\gamma^2}{(p-2)(p-3)}, \qquad \mu'_3 = \frac{\gamma^3}{(p-2)(p-3)(p-4)}.$$

Transferring the moments to the centroid vertical,

$$\mu_2 = \frac{\gamma^2}{(p-2)^2(p-3)} \quad\text{and}\quad \mu_3 = \frac{4\gamma^3}{(p-2)^3(p-3)(p-4)},$$

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{16(p-3)}{(p-4)^2},$$

and

$$(p-4)^2 - \frac{16}{\beta_1}(p-4) - \frac{16}{\beta_1} = 0.$$

$p - 4$ will have to be taken as the positive root of the equation,
or $\gamma$, which from the above equations is given by

$$(p - 2)\sqrt{\mu_2(p - 3)},$$

will be imaginary.

Since the tangent to the curve at the top of the maximum
ordinate is parallel to the axis of $x$, the position of the mode is
such that $dy/dx$ is zero there, i.e.
$y_0\,x^{-p}\,e^{-\gamma/x}\left(-\dfrac{p}{x} + \dfrac{\gamma}{x^2}\right)$ is zero.
$x = 0$ and $x = \infty$ give the cases in which the curve touches the
axis of $x$, and the other case, the one required, is when
$\dfrac{\gamma}{x^2} - \dfrac{p}{x} = 0$, or $x = \dfrac{\gamma}{p}$, i.e. the mode is
$\dfrac{\gamma}{p}$ from the origin.

Uncommon Frequency Types

Up to the present we have dealt with common types of
frequency-curves, but in the course of statistical work a
distribution is sometimes found which appears different in its
algebraic form from the usual types, but can nevertheless be
described accurately by those types. An example which will
give an indication of the kind of case we have in mind is a
distribution arising from recording the number of sequences in
coin-tossing or dice-throwing experiments. The distribution is a
geometric progression and this, a well-known result in
probability, is a special case of Type III if $p = 0$, for we then obtain
the exponential $e^{-\gamma x}$, which gives the series we want. Certain
limiting cases of Types I, II and VI give straight lines, curves
starting with an infinite and ending with a finite ordinate, two
separated blocks of frequency, and curves starting at a finite
ordinate and ending at zero either at a finite point or at
infinity; among these last is, of course, the exponential to which
we have already referred.

Before turning to the expressions for these new types it may
be useful to give a table of various peculiar distributions that
have been obtained from insurance and other material.


Examples of uncommon Frequency Types

  (1)      (2)      (3)       (4)       (5)      (6)
  469       45      119     4,165        33       68
  186       38      100     2,028        53       24
  166       46       86       982        65       17
  134       53       75       480        81       14
  122       43       61       266       101       12
  112       38       50       132       131       11
            49       39        71       186       10
            41       27        36       350       10
            44       22        17                 10
            52       12         9                 11
                      3         2                 12
                                1                 20
                                1
                                1
                                1
1,189      449      594     8,192     1,000      219


The table includes (col. 6) areas of a U-shaped curve which
is rare; in fact, I have not succeeded in finding a suitable
distribution of this shape among actuarial statistics, but such a
distribution might occur among terminations (including
withdrawals) in term policies of ten years, say, or similar endowment
assurances.

We may now deal with these cases, but we shall discuss them
in less detail than the more important types.

Type VIII

$$y = y_0\left(1 + \frac{x}{a}\right)^{-m}$$

Range from an infinite ordinate at $-a$ to a finite ordinate, $y_0$,
at 0.

$m$ is found from the solution of

$$m^3(4 - \beta_1) + m^2(9\beta_1 - 12) - 24\beta_1 m + 16\beta_1 = 0$$

and must be neither $< 0$ nor $> 1$.

$$a = \pm\,\sigma(2 - m)\sqrt{\frac{3 - m}{1 - m}}$$

$$y_0 = N(1 - m)/a$$

The distance of the mean from $x = -a$ is $a(1 - m)/(2 - m)$,
and from $x = 0$ is $-a/(2 - m)$. When $\mu_3$ is positive $a$ is negative.
If we use the form with origin at mean (see Table VI, facing
p. 51),

$$A = a(1 - m)/(2 - m) \quad\text{and}\quad y_e = N(1 - m)(2 - m)^m / \{a(1 - m)^m\}.$$

The curve is a special case of Type I when $m_2$ is zero, that
is, when

$$r - 2 = r(r + 2)\sqrt{\frac{\beta_1}{\beta_1(r + 2)^2 + 16(r + 1)}}$$

where $r = 6(\beta_2 - \beta_1 - 1)/(6 + 3\beta_1 - 2\beta_2)$.

Thus the test for the suitability of the curve is that

$$\frac{(4\beta_2 - 3\beta_1)(10\beta_2 - 12\beta_1 - 18)^2 - \beta_1(\beta_2 + 3)^2(8\beta_2 - 9\beta_1 - 12)}
{(3\beta_1 - 2\beta_2 + 6)\{\beta_1(\beta_2 + 3)^2 + 4(4\beta_2 - 3\beta_1)(3\beta_1 - 2\beta_2 + 6)\}}$$

or $\lambda$, say, is zero.

The criteria for Type VIII can be reduced to (1) special case
of Type I, (2) $\lambda = 0$, (3) $5\beta_2 - 6\beta_1 - 9$ is negative. It may be
added that $24\beta_2 - 27\beta_1 - 36$ is small; theoretically positive.
If $\beta_1 = 0$ an interesting special case arises, in which $m = 0$,
and the curve becomes a horizontal line, which is also the limit
of Types IX and XII.

The solution of the cubic for $m$ gives trouble. $m$ can also be
found from $m = -2(5\beta_2 - 6\beta_1 - 9)/(3\beta_1 - 2\beta_2 + 6)$, and though
this involves $\beta_2$ it should theoretically give the same value of
$m$ as the cubic. As the criterion is not exactly reached in
practice the two results differ, and it seems preferable to find
$m$ from the cubic by using

$$m = \frac{24\beta_1 - \sqrt{576\beta_1^2 - 64\beta_1\,[\,m'(4 - \beta_1) + 9\beta_1 - 12\,]}}{2\{m'(4 - \beta_1) + 9\beta_1 - 12\}}$$

where $m'$ is found from the expression in $m$ and $\beta_2$ given above
or by some other trial method.

An alternative is to find from the criteria or from the
diagram in Tables for Statisticians the value of $\beta_2$ which is the
consequence of the particular value of $\beta_1$ when a Type VIII
curve occurs, and use this theoretical value in finding $m$ instead
of the $\beta_2$ given by the actual statistics.


Example

Frequency     Graduation (1)     Graduation (2)
    469            437                436
    186            222                209
    166            165                161
    134            136                141
    122            120                127
    112            109                115
  1,189          1,189              1,189

The mean is .65518 of an interval after the centre of the 186
group. The constants were

$\mu_2 = 2.986$       $\beta_1 = .408$
$\mu_3 = 3.295$       $\beta_2 = 2.047$
$\mu_4 = 18.252$      $\lambda = -.05$

$5\beta_2 - 6\beta_1 - 9$ negative. Hence Type VIII can be used.

$m = .500$
$a = -5.797$
$y_0 = 102.6$

The curve runs from .277 before the middle of the first to
.02 after the end of the last group. The graduation is shown
(No. 1) above.
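The trial method for $m$ can be sketched as a simple iteration: start from the $\beta_2$ expression for $m'$, solve the quadratic, and feed the result back in. Repeating the step until it settles is an assumption of this sketch (the text only describes a single trial value), and $y_0$ is taken on the absolute value of $a$ so that it stays positive; the function names are illustrative.

```python
import math

def type_viii_m(beta1, beta2, iterations=50):
    """Root of the Type VIII cubic in (0, 1), by repeated use of the quadratic."""
    m = -2 * (5 * beta2 - 6 * beta1 - 9) / (3 * beta1 - 2 * beta2 + 6)  # trial m'
    for _ in range(iterations):
        k = m * (4 - beta1) + 9 * beta1 - 12       # coefficient of m^2
        m = (24 * beta1 - math.sqrt(576 * beta1 ** 2 - 64 * beta1 * k)) / (2 * k)
    return m

def type_viii_constants(n, mu2, mu3, m):
    """a (negative when mu3 is positive) and y0 for y = y0 (1 + x/a)^(-m)."""
    a = math.sqrt(mu2) * (2 - m) * math.sqrt((3 - m) / (1 - m))
    y0 = n * (1 - m) / a
    return (-a if mu3 > 0 else a), y0

m = type_viii_m(0.408, 2.047)
a, y0 = type_viii_constants(1189, 2.986, 3.295, m)
print(m, a, y0)
```

The iteration settles in a handful of steps for the figures of this example.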

The areas can be calculated by the expression

$$\frac{y_r\,(a - r)}{1 - m}$$

which gives the area of the remainder from $-r$ to $-a$. In the
particular case the range could be fixed at 6 as the data related
to six months' experience of maturities among endowment
assurances, and remembering that the distance of the mean from
the infinite ordinate is

$$a(1 - m)/(2 - m)$$

we found

$m = .439$      $y_0 = 111.1$

The areas resulting are given in graduation (2). The following
table gives the calculation of the areas in this case. The
equation to the curve is

$$y = 111.1\left(1 - \frac{x}{6}\right)^{-.439}$$

with range from 0 to 6, $a$ being negative because $\mu_3$ is positive.

 x    1 - x/6   Colog (2)   (3) × m   (4) + log 111.1   log{y_x/(1 - m)}   Antilog (6)   Remainder   (7) × (8)   Area
                                       = log y_x                                          of range                required
(1)     (2)        (3)        (4)          (5)                (6)              (7)           (8)         (9)       (10)
 0                                                                                            6         1,189      115
 1     .8333      .0792      .0348        2.0804             2.3318           214.7           5         1,074      127
 2     .6667      .1761      .0771        2.1230             2.3744           236.8           4           947      141
 3     .5000      .3010      .1323        2.1779             2.4293           268.7           3           806      161
 4     .3333      .4815      .2118        2.2574             2.5088           322.7           2           645      209
 5     .1667      .7781      .3419        2.3875             2.6389           435.5           1           436      436

Some of the columns can be dispensed with; they are shown
in detail to make the method clear.

Both graduations are reasonably close to the facts.
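The logarithmic working for graduation (2) can be replaced by a direct calculation. In the sketch below $m$ is obtained from the position of the mean (2.15518 intervals from the infinite ordinate, i.e. .65518 after the centre of the second group) with the range fixed at 6; the function names are illustrative assumptions.

```python
def m_from_mean(span, mean_from_infinite_end):
    """Solve span * (1 - m) / (2 - m) = distance of the mean from the infinite ordinate."""
    d = mean_from_infinite_end
    return (span - 2 * d) / (span - d)

def remainder_area(x, n, m, span):
    """Area between x and the infinite ordinate at x = span: y_x (span - x) / (1 - m)."""
    y0 = n * (1 - m) / span
    y_x = y0 * (1 - x / span) ** (-m)
    return y_x * (span - x) / (1 - m)

def group_areas(n, m, span):
    """Unit-interval areas, starting from the finite-ordinate end (x = 0)."""
    rem = [remainder_area(x, n, m, span) for x in range(span)] + [0.0]
    return [rem[i] - rem[i + 1] for i in range(span)]

m = m_from_mean(6, 2.15518)
areas = group_areas(1189, m, 6)
print(round(m, 3), [round(v) for v in areas])
```

The result should agree with column (10) to within about three units; the printed table was worked with four-figure logarithms, so exact agreement in every group is not to be expected.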
An example of the limiting case will be found in the following
statistics:

No.     Frequency     Graduation     Theoretical
 1          45            42              45
 2          38            45              45
 3          46            45              45
 4          53            45              45
 5          43            45              45
 6          38            45              45
 7          49            45              45
 8          41            45              45
 9          44            45              45
10          52            48              45
           449           450             450

The mean is .57 after the middle of the 5 group; the moments
are

$\mu_2 = 8.374$       $\beta_1 = .000$
$\mu_3 = .026$        $\beta_2 = 1.78$
$\mu_4 = 124.46$

Hm “ »-S 

The range is from .57 to 10.43.

The series was found by summing in tens the last figure of the
Carlisle 3½ per cent. Table of $A_x$, and the mean should be at 5.5
theoretically instead of 5.57 and $y$ should be 45. The range
should be .5 to 10.5. The example is interesting as showing how
the Pearson-curves graduate in an extreme case. The
"graduation" and theoretical results are shown. In the "graduation"
decimals have been neglected.

Type IX

$$y = y_0\left(1 + \frac{x}{a}\right)^{m}$$

Range from $x = -a$ where $y = 0$ to $x = 0$ where $y = y_0$.

$$a = \pm\,\sigma(m + 2)\sqrt{\frac{m + 3}{m + 1}}$$

$m$ is found by solving

$$m^3(\beta_1 - 4) + m^2(9\beta_1 - 12) + 24m\beta_1 + 16\beta_1 = 0$$

$$y_0 = N(m + 1)/a$$

The distance of the mean from $x = -a$ is $a(m + 1)/(m + 2)$ and
from $x = 0$ is $-a/(m + 2)$.

If we use the form with origin at mean (see Table VI, facing
p. 51),

$$A = (m + 1)a/(m + 2) \quad\text{and}\quad y_e = \frac{N(m + 1)^{m+1}}{a(m + 2)^{m}}.$$

As in Type VIII, the value of $m$ can be found by simplifying
the cubic into a quadratic, or by the other method indicated.

The criteria are reached through the same equation as those
for Type VIII, and can be reduced to (1) special case of Type I,
(2) $\lambda = 0$, (3) $5\beta_2 - 6\beta_1 - 9$ is positive, (4) $2\beta_2 - 3\beta_1 - 6$ is
negative.

If $\beta_2 = 2.4$ and $\beta_1 = .32$, the curve becomes a sloping line.

If $\beta_1 = 0$ we reach a horizontal line as the limit, while if
$\beta_2 = 9$ and $\beta_1 = 4$, we have the other limit of Type IX, and
find the exponential series (Type X).

Example

Duration     Exposed to risk     Type IX     Frequency
             in annuity                      line
             experience
    0             119              118          108
    1             100               98           97
    2              86               85           86
    3              75               74           76
    4              61               63           65
    5              50               52           54
    6              39               41           43
    7              27               30           32
    8              22               20           22
    9              12               11           11
   10               3                2            0
                  594              594          594
594 
The mean is at 2.909 assuming the exposed to risk to be an
ordinate at the duration or an area from $n - \tfrac{1}{2}$ to $n + \tfrac{1}{2}$.

$\mu_2 = 6.27$        $\beta_1 = .490$
$\mu_3 = 10.99$       $\beta_2 = 2.606$
$\mu_4 = 102.50$

$5\beta_2 - 6\beta_1 - 9$ is positive. $2\beta_2 - 3\beta_1 - 6$ is negative.

The curve is not far from Type IX, and if $\beta_1$ had been .32 and
$\beta_2$ had been 2.4, we should have reached a straight line
with range from -.6 to 10.0 and obtained the graduation
shown. The whole area -.6 to +.5 is taken as the frequency
for duration 0. Using Type IX, the following constants are
reached:

$m = 1.123$, $a = -10.913$, $y_0 = 115.54$

The curve runs from -.586 to 10.3275. The 118 in the first
group has been taken as the area from -.586 to +.5.
Theoretically there cannot be an exposure before duration -.5, but
as we are merely giving an example of fitting a curve to a series
of numbers this need not concern us. The difficulty could be
met by fitting a system of ordinates or by assuming a starting
point for the curve.

If $m$ happens to be less than unity the shape of the curve is
somewhat different, e.g. if $y = 100\,(1 + x/a)^{m}$ with $m$ a fraction
we have the following ordinates:

100, 98, 95, 91, 88, 84, 79, 74, 67, 56, 0

The actual deaths in a select mortality experience may take
this form, but the shape of the curve will be less flat at the start,
e.g. in the American Medico-Actuarial experience 1913, age
group 30-34.
Type X

$$y = y_0\,e^{-x/\sigma}, \qquad y_0 = N/\sigma$$

Range from 0 to $\infty$.

Distance of origin from the mean is $\sigma$.
The ordinate at the mean ($y_e$) is $N/(e\sigma)$.

The curve is a special case of Type III when $\gamma a = 0$, that is
when $\beta_1 = 4$.

The condition for Type III is given by $2\beta_2 = 6 + 3\beta_1$. Hence
the exponential form is given by $\beta_1 = 4$ and $\beta_2 = 9$. The curve
is also the limit of Types IX and XI.


Example

         Frequency     Graduation     Theoretical
  1        4,165         4,132           4,096
  2        2,028         2,016           2,048
  3          982         1,015           1,024
  4          480           511             512
  5          266           257             256
  6          132           130             128
  7           71            65              64
  8           36            33              32
  9           17            17              16
 10            9             8               8
 11            2             4               4
 12            1             2               2
 13            1             1               1
 14            1             1               1
 15            1
           8,192         8,192           8,192


The unadjusted mean is 2.0087.

$\mu_2 = 2.045$       $\beta_1 = 4.629$
$\mu_3 = 6.290$       $\beta_2 = 9.502$
$\mu_4 = 39.720$      $\sigma = 1.43$

When the curve is an exponential the moments and mean
require adjustment, but the Sheppard high contact
adjustments are, of course, unsuitable. If the curve starts at the
beginning of the first group, I think that the mean is overstated
when $\mu_3$ is positive by $1/12\sigma$ approximately, and the second
moment about the true mean is understated by
approximately.* Making use of the adjustments the mean is now
1.934, $\mu_2$ is 2.123, and $\sigma$ is 1.457.

The statistics relate to sequences in coin-tossing and the
theoretical figures are added. In the statistics as published the
sequences of 11, 12, etc. were 2, 1, 0, 1, 0, 2. Strictly speaking
we are dealing with a system of ordinates; I made the
calculation as a series of areas in order to introduce the adjustment of
moments. In calculating the graduated areas of the curve it is
useful to remember that the area from $a$ to $b$ is $(y_a - y_b)\,\sigma$.

It is interesting to notice how the "graduation" keeps
closer to the frequency than the theoretical result.
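The graduated column can be rebuilt from the adjusted mean and $\sigma$ with the area rule just quoted. This sketch assumes that group $k$ covers $k - \tfrac{1}{2}$ to $k + \tfrac{1}{2}$, that the curve starts at a distance $\sigma$ before the mean, and that the first group takes everything from the start of the curve to 1.5; the function name is illustrative.

```python
import math

def exponential_graduation(n, mean, sigma, first_start, groups):
    """Areas of y = (N/sigma) e^(-x/sigma), measured from origin = mean - sigma."""
    origin = mean - sigma

    def remainder(x):
        # Area from x to infinity, i.e. y_x * sigma.
        return n * math.exp(-max(0.0, x - origin) / sigma)

    areas = []
    for k in range(groups):
        lo = origin if k == 0 else first_start + k
        hi = first_start + k + 1
        areas.append(remainder(lo) - remainder(hi))
    return areas

areas = exponential_graduation(8192, 1.934, 1.457, 0.5, 6)
print([round(v) for v in areas])
```

Each area is a difference of two remainders, which is the $(y_a - y_b)\,\sigma$ rule in another form.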

I give as a second example the following series based on
cricket scores known to start at the beginning of the first
group.

Score        Series     Graduation
0-19           64           64
20-            34           34
40-            18           18
60-             9           10
80-             6            5
100-            3            3
120-            3            1
140-            0            1
160-            0            1

The ratio of each term to the preceding is .54, and the
graduation is almost exact. Owing, however, to the 3 at the group
120, the moments give a criterion considerably removed from
the theoretical $\beta_1 = 4$, $\beta_2 = 9$.

If we had assumed the start of the curve in the previous
example, we should have reproduced the theoretical result
almost exactly.

* See, however, general discussion in Appendix I.
Type XI

$$y = y_0\,x^{-m}$$

Range from $x = b$ where $y = y_0 b^{-m}$ to $x = \infty$ where $y = 0$.

$m$ is found from

$$m^3(4 - \beta_1) + m^2(9\beta_1 - 12) - 24\beta_1 m + 16\beta_1 = 0$$

$$b = \pm\,\sigma(m - 2)\sqrt{\frac{m - 3}{m - 1}}$$

$$y_0 = N\,b^{\,m-1}(m - 1)$$

The distance of mean from origin is $b(m - 1)/(m - 2)$.

If we use the form with origin at mean (see Table VI, facing
p. 51), $A = b(m - 1)/(m - 2)$ and

$$y_e = \frac{N}{b}\cdot\frac{(m - 2)^{m}}{(m - 1)^{m-1}}.$$

As in Type VIII, $m$ can be found by simplifying the cubic
into a quadratic or by the other method indicated.

$m$ may have any value from 5 to $\infty$, but in practice its value
is not less than 9.

The curve is a special case of Type VI when $q_2 = 0$.
The criteria can be expressed as (1) special case of Type VI,
(2) $\lambda = 0$, (3) $2\beta_2 - 3\beta_1 - 6$ is positive.


Example

Duration     Withdrawals     Graduation
                             by XI
    0            165           183
    1             65            53.9
    2             23            32.6
    3             32            20.0
    4             13            12.4
    5              8             7.6
    6              1             4.9
    7              6             3.1
    8              3             1.9
    9              3             1.2
   10              1              .8
   11              3             1.6
                 323           323

I have not come across a distribution really represented by
this type, but I give an unsuccessful attempt to apply it to a
series of withdrawals. The constants were

$\beta_1 = 4.97$       $b = 57.14$
$m = 29.69$       $\log y_0 = 54.3563$

Distance of mean from origin = 59.205

In calculating areas we use $y_0\,a^{-(m-1)}/(m - 1)$ as the area from
$a$ to $\infty$.

Twisted J-shaped Curve

As pointed out in the notes on Type I (p. 59), we obtain an
interesting curve when both $m_1$ and $m_2$ are numerically less
than unity and one of them is negative. It arises when

$$\beta_2 > 1.5 + 1.125\beta_1$$

and when

$$\beta_2 < 2 + 1.25\beta_1$$

as can be seen by remembering that the sum of the values of the
$m$'s must lie between 1 and -1, or $r$ lies between 3 and 1. A
special case has been discussed as a transition type (No. XII)
when

$$y = y_0\left[\frac{\sigma\{\sqrt{3 + \beta_1} + \sqrt{\beta_1}\} + x}{\sigma\{\sqrt{3 + \beta_1} - \sqrt{\beta_1}\} - x}\right]^{\sqrt{\beta_1/(3 + \beta_1)}}$$

Range from $x = \sigma(\sqrt{3 + \beta_1} - \sqrt{\beta_1})$ to $x = -\sigma(\sqrt{3 + \beta_1} + \sqrt{\beta_1})$.

The origin is at the mean.

$$y_0 = \frac{N}{b\,\Gamma(m + 1)\,\Gamma(1 - m)}$$

where $m = \sqrt{\beta_1/(3 + \beta_1)}$ and $b = 2\sigma\sqrt{3 + \beta_1}$.

When $\mu_3$ is positive, the negative sign is taken for the square
roots.

The limit of the curve when $\beta_1 = 0$ is a horizontal line.
The criterion is $5\beta_2 - 6\beta_1 - 9 = 0$.
Example

Frequency     Graduation     Ordinates
                   2
                                18
    33            31            31.6
                                40.6
    53            49            49.3
                                57.0
    65            65            65.2
                                73.4
    81            83            82.4
                                92.3
   101           103           103.3
                               114.6
   131           134           132.9
                               155.5
   186           191           186.0
                               244.2
   350           342           405.4
 1,000         1,000

The mean is .051 after the centre of the 131 group. The
constants are

$\mu_2 = 4.266$        $\beta_1 = .761$
$\mu_3 = -7.688$       $\beta_2 = 2.646$
$\mu_4 = 48.154$       $5\beta_2 - 6\beta_1 - 9 = -.368$

$$y = 87.2\,\{(5.808 + x)/(2.204 - x)\}^{.45}$$

In addition to the graduation a number of equidistant
ordinates is given. They show that the curve rises abruptly,
then less abruptly and then again more abruptly. The
withdrawals in select tables are sometimes of this shape (e.g.
Japanese experience, 1910, age 52, females). A somewhat
similar twist occurs in a population curve.


U-shaped Curve

This shape arises in Type I when $m_1$ and $m_2$ (or $m$ in Type II)
are negative. There are difficulties in fitting it to statistics
because it is awkward to adjust the rough moments. The
figures given in the table of examples (p. 101, col. 6) were
found by calculating the areas of the curve

-4+rnr • 

The limit of the U-curve is two separate blocks of frequency
at the ends of the range. This limit is reached when

$$\beta_2 - \beta_1 - 1 = 0.$$

Some of the curves with which we have dealt are rare and in
practical curve-fitting may be avoided, for they depend on
certain definite values of $\beta_1$ and $\beta_2$, and the chance of reaching
these exact values is negligible. In other words, if the object of
fitting a curve in any particular case is to obtain the closest
agreement between the actual figures given and the graduated
figures, then the main Types (I, IV and VI) are all that are
necessary, for the other types, being transition types and
depending on specific values of $\beta_1$ and $\beta_2$, need not arise. If,
however, our object is to study probability in a wider sense, the
transition types are of importance and they may, of course, be
properly used when the values of the $\beta$'s only differ from those
indicated by the criteria to a small extent. This "small
extent" means (as we shall see later) within the limit suggested
by the standard errors of the $\beta$'s.

ADDITIONAL EXAMPLES 

1 . Up to the present we have merely considered examples 
with a view to illustrating the various types of frequency- 
curves, but it seems advisable to consider one or two practical 
examples which may help to show the range of applicability 
of the curves m actuarial work, and give an opportunity of 
noticing a few difficulties which may arise m applying them. 

The function with which actuaries generally wish to deal in 
practical work is not an exposed to risk or series of deaths or 
withdrawals, but the ratio between the deaths and the exposed,
that is, with the rates of mortality, sickness, marriage, and
withdrawal. An actuary studying frequency-curves may therefore
naturally ask whether any of these rates can be graduated by
means of the curves we have examined, and, if they fail, whether
they must be put aside for some other method. Now the first point
to be considered is whether these rates are frequency
distributions; if they are not, the use of the frequency-curve is
empirical. A rate of mortality gives the proportion of people at
each age who die, and if we imagine 1,000 persons exposed to risk
at each integral age, the number of deaths would be 1,000 times
the rate of mortality, and this seems to show that it is possible
to consider the rate of mortality as a distribution, though it is
one that could hardly arise in actual experience. It is impossible
to describe the rates of mortality or sickness by a single
frequency-curve. On the other hand, the rates of marriage are
certainly much like frequency-curves, and the rates of withdrawal,
whether regarded according to age or duration, might take a form
like our example in Type III. There are, however, practical
objections to the direct operation on rates, even apart from the
very exaggerated idea of frequency distributions in which it is
necessary to indulge. The numbers exposed to risk at the end of
any table become small, and a single death or marriage there
gives a very large rate, while at several ages near there may be
a zero rate shown by the ungraduated data. This is extremely
awkward, as it tends to make the ratios dealt with far rougher in
application than the actual observations are in fact, and we are
forced to group the material before using it, which introduces an
arbitrary practice which it is well to avoid so far as possible.
It must not, of course, be inferred that a small number of say
fifty or one hundred deaths must necessarily be grouped according
to each year of age, but that even if there are two or three
thousand the roughnesses introduced by the use of rates influence
the result considerably. Graduating rates means that an equal
weight is given to each rate of mortality, which is far from the
weight indicated by the exposed to risk.
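
The point about weights can be made concrete with a small sketch. The figures below are hypothetical (they are not taken from any table in this book) and are only meant to show how a single death among a few lives produces a large rate, and how far an equal weighting of rates departs from the weighting given by the exposed to risk.

```python
# Hypothetical ungraduated data at the end of a table (not from the text).
ages = [88, 89, 90, 91, 92]
exposed = [40, 22, 9, 4, 2]      # numbers exposed to risk
deaths = [9, 6, 0, 2, 0]         # observed deaths

rates = [d / e for d, e in zip(deaths, exposed)]

# Equal weights: every age counts the same, however few lives lie behind it.
unweighted = sum(rates) / len(rates)

# Weights given by the exposed to risk: total deaths over total exposed.
weighted = sum(deaths) / sum(exposed)

for age, e, d, r in zip(ages, exposed, deaths, rates):
    print(f"age {age}: exposed {e:3d}, deaths {d}, rate {r:.3f}")
print(f"mean of rates (equal weights): {unweighted:.3f}")
print(f"deaths / exposed (exposure weights): {weighted:.3f}")
```

Two deaths among four lives at age 91 give a rate of .5 flanked by zero rates, which is the roughness described above.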

2. It will be useful to consider a case bearing out these
objections and then deal with a practical method of overcoming
them. The statistics to be considered have been taken from a
paper by M. Mackenzie Lees “On Rates of Mortality and
Marriage among daughters of Peers and Heirs Apparent, etc.”
(Trans. Fac. Actu. i, 276), and may be summarised as on p. 117.

The moments were calculated by the Summation Method,
and were found, about the mean 28.77191, to be

    $\mu_2$ = 63.2092        $\beta_1$ = 1.557153
    $\mu_3$ = 627.101        $\beta_2$ = 4.781321
    $\mu_4$ = 19,103.3

The criterion was $\kappa = -1.5$, but as I had neglected the rate
.0089 at 71 in calculating the moments, I used Type III. The
inclusion of the rate at that age would have lengthened the
curve and considerably increased the arithmetical value of the
criterion. Moreover $2\beta_2$ approximates to $6 + 3\beta_1$.

The constants for Type III were

    $\gamma$ = .201592       a = 7.78189
    p = .1.56881             Mode = 23.81128

The curve starts, therefore, at age 16.02939.

    $y_0$ = 890.05.
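
As a check on the arithmetic, the constants can be recomputed from the printed moments. The sketch below assumes the usual Pearson expressions ($\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, the criterion $\kappa$, and for Type III $\gamma = 2\mu_2/\mu_3$, $p = 4/\beta_1 - 1$, $a = p/\gamma$); these formulae belong to the earlier chapters and are not restated in this section, and the function name is only illustrative.

```python
def type_iii_from_moments(mean, mu2, mu3, mu4):
    """Pearson constants and Type III parameters from moments about the mean."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    kappa = (beta1 * (beta2 + 3) ** 2
             / (4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6)))
    gamma = 2 * mu2 / mu3      # Type III: y = y0 (1 + x/a)^(gamma a) e^(-gamma x)
    p = 4 / beta1 - 1          # p = gamma * a
    a = p / gamma              # distance from the start of the curve to the mode
    mode = mean - 1 / gamma    # for Type III, mean - mode = mu3 / (2 mu2)
    start = mode - a
    return {"beta1": beta1, "beta2": beta2, "kappa": kappa,
            "gamma": gamma, "p": p, "a": a, "mode": mode, "start": start}

# Moments of the marriage rates quoted in the text.
c = type_iii_from_moments(28.77191, 63.2092, 627.101, 19103.3)
for name, value in c.items():
    print(f"{name:6s} {value:.5f}")
```

The last figures differ slightly from the printed ones, as would be expected from rounding in the hand calculation.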

The rates resulting from this graduation are given in the
table, and while they tend to show that the distribution of rates
of marriage is closely allied to a frequency-curve, they do not
give a satisfactory graduation, and the failure is due almost
entirely to the objections mentioned above. If we were examining
the algebraic form taken by rates of marriage, we should
begin by work on population data where the roughness of
material is avoided by the large numbers of individuals dealt
with; as, however, we are seeking for a graduation, we must see
how these objections, which of course apply to some extent to
any method of graduation, can be overcome. It has been
remarked that the cause of the difficulty is that incorrect
weights are given to the items used, and the most obvious
suggestion is that the actual exposed and marriages should be
graduated separately. This, however, entails a large amount of
additional work and seems to overlook the fact that deviations
in the exposed to risk and the marriages are not independent.
A shorter method can be used which avoids both the double
graduation and the error just indicated. This method consists
of using a series allied to the exposed, and treating it as a
hypothetical exposed to risk from which a new series of
marriages can be calculated. The advantages are that we have
only to make one graduation, and the weights of the various
parts of the table are given approximately. In a similar way
$q_x$ can be graduated, and in this connection it may be remarked
that as the exposed to risk is generally capable of being
represented by a frequency-curve, it is natural to suggest that the
hypothetical exposed might be taken as the simplest form
assumed by such curves (normal curve); this is also convenient
because the ordinates for such curves have been tabulated.
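
The mechanics of the method may be sketched as follows. The rates, and the mean and standard deviation chosen for the hypothetical exposed, are invented for illustration; only the steps (normal ordinates as hypothetical exposed, products formed, rates recovered by division) follow the description above.

```python
import math

def normal_ordinate(z):
    """Ordinate of the standard normal curve at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Hypothetical ungraduated rates by age (illustrative only).
ages = list(range(17, 27))
rates = [0.010, 0.030, 0.055, 0.075, 0.088, 0.090, 0.084, 0.074, 0.062, 0.050]

# Hypothetical exposed: a normal curve (mean and sd chosen arbitrarily here),
# scaled by 10**5 so that the products are of convenient size.
mean_age, sd = 24.0, 5.0
hyp_exposed = [1e5 * normal_ordinate((x - mean_age) / sd) for x in ages]

# Hypothetical marriages: the single series that would be graduated.
hyp_marriages = [e * m for e, m in zip(hyp_exposed, rates)]

# After graduation, rates are recovered by dividing by the same exposed.
# (No graduation is applied in this sketch, so the rates come back unchanged.)
recovered = [mm / e for mm, e in zip(hyp_marriages, hyp_exposed)]
```

In a real application the series `hyp_marriages` would be graduated by a frequency-curve before the division, so that the weights of the different ages follow the hypothetical exposed rather than being equal.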

3. The hypothetical exposed can be fixed by trial or from the
values of the exposed. The column $E'_x$ in the table given on
p. 117 is taken from Sheppard's Tables in Tables for Statisticians,
x being taken as 3.06, 3.084, 3.108, 3.132, etc., and the entries
were multiplied by $10^6$. $M'_x = E'_x m_x$ was then formed and
graduated. The following values were obtained for the $M'_x$
series:

    Mean = 24.85779          $\beta_1$ = 1.40775
    $\mu_2$ = 29.5006        $\beta_2$ = 5.01114
    $\mu_3$ = 190.112        $\kappa$ = -7.102
    $\mu_4$ = 4,361.12

As $\kappa$ is large, Type III was used, and

    $\gamma$ = .310350       $y_0$ = 192.625
    p = 1.841405             Mode = 21.63562
    a = 5.933325

The curve was then worked out and the rates of marriage
in the final column were obtained by dividing $M'$ by $E'$. They
agree closely with the ungraduated figures.

4. A numerical example of the application of the method to
the $O^{NM(5)}$ Table may now be given. The normal curve with
$\sigma = 10$ and origin at age 52½ was used, and the values were
multiplied by $q_x$ with the help of Crelle's Tables.

A part of the work was:

    q_x E_x 10^5    Age    Ordinate from         Age    q_x E_x 10^5
                           Sheppard's Tables
        810          52       .3984439            53        801
        597          51       .3944793            54        850
        644          50       .3866681            55        875


Summing these entries ($q_x E_x 10^5$) in fives, I formed the
following:

    Age     q_x E_x 10^5
     20          13
     25          70
     30         218
     35         594
     40       1,394
     45       2,460
     50       3,702
     55       4,519
     60       4,385
     65       3,602
     70       2,249
     75       1,197
     80         461
     85         133
     90          31
     95           5
    100           1
             ------
             25,034



The abbreviations (use of Crelle's Tables and grouping)
were adopted to save labour, and as the figures were required
for an example they are sufficiently accurate.
The following values were then found:

    Mean age = 59.439762
    $\mu_2$ = 4.584327
    $\mu_3$ = -.4999871
    $\mu_4$ = 61.17014

    Type of curve: No. I.
    $m_1$ = 32.81166         $m_2$ = 26.57123
    $a_1$ = 18.78553         $a_2$ = 15.21272
    $y_0$ = 4609.884
    Mode age = 59.730789

(The unit is 5 years of age.)
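
The mean age and second moment can be reproduced from the grouped figures. The sketch below assumes, as the later remark on grouping indicates, that each product $q_x E_x$ stands on a base centred at $x + \tfrac12$ (so that the group 20-24 is centred at 22½), and that Sheppard's adjustment of 1/12 has been deducted from the second moment; the variable names are illustrative.

```python
# Grouped values of q_x * E_x * 10^5 (groups 20-24, 25-29, ..., 100-104).
first_ages = list(range(20, 101, 5))
freq = [13, 70, 218, 594, 1394, 2460, 3702, 4519, 4385, 3602,
        2249, 1197, 461, 133, 31, 5, 1]

total = sum(freq)  # 25,034

# Work in units of 5 years, measured from the first group.
u = list(range(len(freq)))
nu1 = sum(i * f for i, f in zip(u, freq)) / total
nu2 = sum(i * i * f for i, f in zip(u, freq)) / total

# Centre of the first group is 22.5, since q_x E_x stands on a base
# centred at x + 1/2.
mean_age = 22.5 + 5 * nu1

# Second moment about the mean (5-year unit), less Sheppard's adjustment.
mu2 = nu2 - nu1 ** 2 - 1 / 12

print(total, round(mean_age, 6), round(mu2, 6))
```

Under these assumptions the mean age and $\mu_2$ agree with the printed values.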

The ordinates were then calculated for every fifth age, and
finding that the curve is not very far removed from the normal
curve of error, I interpolated in the second differences of the
logarithms of the ordinates for those at the other ages.* A
quadrature formula was used for finding areas, and $q_x$ was
found by dividing by the hypothetical figure already used for
the exposed.

The expected deaths were as follows:


    Group    Graduated q_x      Actual    Expected       Deviation
             for central age                             -        +
             of group
    15-19                                      1.5      1.5
    20-         .00643              9          8.9                 .1
    25-         .00731             69         61.0                8.0
    30-         .00850            205        204.6                 .4
    35-         .00991            369        380.7     11.7
    40-         .01179            588        575.6               12.4
    45-         .01452            801        811.4     10.4
    50-         .01866          1,064      1,063.8                 .2
    55-         .02505          1,399      1,386.6               12.4
    60-         .03516          1,752      1,773.2     21.2
    65-         .05118          2,164      2,136.7               27.3
    70-         .07682          2,216      2,261.2     45.2
    75-         .11648          1,965      1,925.8               39.2
    80-         .17462          1,237      1,241.9      4.9
    85-         .24870            494        514.4     20.4
    90-         .33286            129        126.0                3.0
    95-         .43289             18         17.3                 .7
    100-                            1          1.5       .5
                               ------    ---------    -----     -----
                               14,480     14,492.1    115.8     103.7
                                                         219.5


* As $e^{-(x-h)^2/2\sigma^2}$ is the equation to normal curve, the
logarithm is $Ax^2 + Bx + C$, say. The criterion shows if the curve
is nearly normal.

5. It will be interesting to examine a particular case of the
method just described, as it is often required by actuaries.
Defining Makeham's hypothesis as $\operatorname{colog} p_x = A + Bc^x$, we
take a normal curve ($y_0 e^{-(x-h)^2/2\sigma^2}$) to represent the exposed and
multiply by the values of $\operatorname{colog} p_x$. This means that we assume
that the products can be represented by

$$y = (A + Bc^x)\,y_0\,e^{-(x-h)^2/2\sigma^2}$$

$$= A y_0 e^{-(x-h)^2/2\sigma^2} + B y_0 e^{-(x^2 - 2hx + h^2 - 2\sigma^2 x \log_e c)/2\sigma^2}$$

$$= A y_0 e^{-(x-h)^2/2\sigma^2} + H B y_0 e^{-\{x - (h + \sigma^2 \log_e c)\}^2/2\sigma^2}$$

where

$$H = e^{\{h^2 + 2h\sigma^2\log_e c + \sigma^4(\log_e c)^2 - h^2\}/2\sigma^2} = e^{h\log_e c + \frac{\sigma^2}{2}(\log_e c)^2},$$

so that, writing $t = h + \sigma^2\log_e c$,

$$y = A y_0 e^{-(x-h)^2/2\sigma^2} + H B y_0 e^{-(x-t)^2/2\sigma^2} \qquad \text{(I)}$$

i.e. the sum of two normal curves both having the same
standard deviation as the exposed curve and one having the
same origin.
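
The identity can be verified numerically. In the sketch below the constants A, B, c, $y_0$, h and $\sigma$ are hypothetical values chosen only for illustration; the second normal curve is centred at $h + \sigma^2\log_e c$ with multiplier $H = e^{h\log_e c + \frac{\sigma^2}{2}(\log_e c)^2}$, as derived above.

```python
import math

# Hypothetical Makeham constants and normal "exposed" curve (illustrative only).
A, B, c = 0.003, 0.00005, 1.095
y0, h, sigma = 1000.0, 55.5, 10.0

ln_c = math.log(c)
t = h + sigma ** 2 * ln_c                      # centre of the second normal curve
H = math.exp(h * ln_c + 0.5 * sigma ** 2 * ln_c ** 2)

def normal_shape(x, centre):
    """e^{-(x - centre)^2 / 2 sigma^2}."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

def product_curve(x):
    """(A + B c^x) multiplied by the normal exposed curve."""
    return (A + B * c ** x) * y0 * normal_shape(x, h)

def two_normals(x):
    """Equation (I): two normal curves with the same standard deviation."""
    return A * y0 * normal_shape(x, h) + H * B * y0 * normal_shape(x, t)

max_rel_err = max(abs(product_curve(x) - two_normals(x)) / product_curve(x)
                  for x in range(20, 101))
print(max_rel_err)
```

The relative difference between the two expressions is at the level of floating-point rounding over ages 20 to 100.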

The difference between the two means gives $\sigma^2 \log_e c$, so

$$\log_{10} c = \frac{t-h}{\sigma^2}\,\log_{10} e. \qquad \text{(II)}$$

The whole solution is made very simple by taking moments
about the known origin (age h), for

$$\int_{-\infty}^{+\infty} x\,y\,dx \quad\text{and}\quad \int_{-\infty}^{+\infty} x^2\,y\,dx$$

(the first two moments) give

$$(t-h)N_2\,^{*} \quad\text{and}\quad N_1\sigma^2 + N_2\{\sigma^2 + (t-h)^2\}\,^{\dagger}$$

where $N_1 = A y_0 \sigma\sqrt{2\pi}$ and $N_2 = H B y_0 \sigma\sqrt{2\pi}$.

Dividing the values just given by $N_1 + N_2$ (the total
frequency), we obtain, as the first moment about the known
origin,

$$\mu_1' = \frac{N_2(t-h)}{N_1+N_2},$$

and, as the second,

$$\mu_2' = \frac{N_1\sigma^2 + N_2\sigma^2 + N_2(t-h)^2}{N_1+N_2} = \sigma^2 + (t-h)^2\,\frac{N_2}{N_1+N_2},$$

* Remember that the normal curve is symmetrical, so that the odd moments
about the mean of such a curve are zero.
† Can be seen at once as the sum of two integrals; $N_1\sigma^2$ gives the second
moment of the first normal curve in (I), and $N_2\{\sigma^2 + (t-h)^2\}$ gives the second
moment of the second normal curve.

or

$$t - h = \frac{\mu_2' - \sigma^2}{\mu_1'}$$

and

$$N_2 = \frac{\mu_1'(N_1+N_2)}{t-h},$$

where $\mu'$ is written for moments about h.
As stated above

$$\log_{10} c = \frac{t-h}{\sigma^2}\,\log_{10} e, \qquad \text{(II′)}$$

and if $y_0 = \dfrac{10^k}{\sigma\sqrt{2\pi}}$, as is generally convenient, then

$$A = N_1/10^k$$

and

$$B = N_2/(10^k \times H) = \frac{N_2}{10^k}\cdot\frac{1}{e^{h\log_e c + \frac{\sigma^2}{2}(\log_e c)^2}} = \frac{N_2}{10^k\,c^h\,e^{\frac{(t-h)}{2}\log_e c}} \quad \text{(see equation (II))}$$

$$= \frac{N_2}{10^k\,c^{\frac{t+h}{2}}}.$$


Care is necessary with regard to the value used for $y_0$, and
consequently with regard to A and B. If Sheppard's tables in
Tables for Statisticians of ordinates (z) be multiplied by, say,
$10^5$ and used as the exposed to risk, the values of A and B
resulting from the work will be $N_1/(10^5\sigma)$ and $N_2/(10^5 H\sigma)$.
The reason is that his tables are in terms of standard deviation.

6. If we assume, as Hardy did when graduating the British
Offices 1863-1893 experience, that $\log_{10} c$ is known, we only
require to calculate one moment which gives us
$\dfrac{(t-h)N_2}{N_1+N_2}$, and this, with the help of equation (II),
enables us to complete the solution. If c were obtained for the
aggregate table, we should use this result for the select tables.

7. A numerical example with the $O^{NM}$ Table may be of
interest. A normal curve with standard deviation 10 and
origin 55½ was taken, and the terms multiplied by $\operatorname{colog} p_x$.
These were then grouped in fives, and the first two moments
calculated about age 55½. One little point should be borne in
mind in connection with the grouping: though the centre of the
base on which the product ($q_x$ × exposed) stands is $x + \tfrac12$, the
result ($\operatorname{colog} p_x$ × exposed) is an ordinate at x; the centre
point of five ages 20 to 24 is 22½ when $q_x$ is used and 22 when
$\operatorname{colog} p_x$ is used.

The figures were

    $N_1 + N_2$ = 13,638.7
    1st moment about 55½ in 5-years unit = 1.416184
    2nd    ,,      ,,      ,,      ,,     = 4.1929354

Deducting Sheppard's adjustment of 1/12 from the second
moment* and multiplying the first moment and the adjusted
second moment by 5 and 25 respectively to make the unit one
year instead of 5 years, we have

    $\mu_1'$ = 7.080920
    $\mu_2'$ = 164.384085

then

    $\log (t-h)$ = .9586889
    $t - h$ = 9.092617
    $\log_{10} c$ = .03948873
    A = .00301749
    B = .00004518782
    $\log_{10} B$ = $\bar{5}$.6550214
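
The constants can be recovered from the two moments with a few lines of arithmetic. The sketch assumes that Sheppard's ordinates were multiplied by $10^5$ (the multiplier mentioned in the note on $y_0$) and takes $N_1 + N_2$ as 13,638.7, the reading of the printed total that reproduces the stated values of A and B; both are assumptions made for this illustration.

```python
import math

sigma, h = 10.0, 55.5             # hypothetical exposed: normal curve, origin 55 1/2
total = 13638.7                   # N1 + N2 (assumed reading of the printed figure)
mu1, mu2 = 7.080920, 164.384085   # moments about h, unit one year
scale = 1e5                       # assumed multiplier of Sheppard's ordinates

t_minus_h = (mu2 - sigma ** 2) / mu1
ln_c = t_minus_h / sigma ** 2
log10_c = ln_c * math.log10(math.e)          # equation (II)

N2 = mu1 * total / t_minus_h
N1 = total - N2
H = math.exp(h * ln_c + 0.5 * sigma ** 2 * ln_c ** 2)

A = N1 / (scale * sigma)
B = N2 / (scale * H * sigma)

print(round(t_minus_h, 6), round(log10_c, 8), round(A, 8), round(B, 11))
```

With these inputs $t - h$, $\log_{10} c$, A and B agree with the values given in the text to within the rounding of the hand calculation.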

$q_x$ was then calculated from the graduated $\operatorname{colog} p_x$ obtained
from the values of A, B and c, and the following table of
expected deaths was worked out. The values of $q_x$ are given in
the table showing the frequency-curve graduation.

* As we are dealing with the sum of five ordinates in each group and not
with an area, we should not, strictly speaking, use Sheppard's adjustment, but
should deduct .08. The difference is small and the constants have not been
recalculated. The formulae would be $\mu_2 = \nu_2 - .08$ and
$\mu_4 = \nu_4 - .48\nu_2 + .02752$, where $\mu$ is adjusted and $\nu$ unadjusted.

    Age        Graduated q_x     Expected    Actual       Deviation
    group      for central age   deaths      deaths       +        -
               of group
    Under 25                         13.0         9      4.0
    25-29         .00812             67.0        69               2.0
    30-           .00882            211.6       205      6.6
    35-           .00991            380.8       369     11.8
    40-           .01162            566.9       588              21.1
    45-           .01431            799.7       801               1.3
    50-           .01854          1,057.5     1,064               6.5
    55-           .02517          1,392.7     1,399               6.3
    60-           .03551          1,790.2     1,752     38.2
    65-           .05160          2,153.0     2,164              11.0
    70-           .07639          2,249.3     2,216     33.3
    75-           .11415          1,888.7     1,965              76.3
    80-           .17053          1,213.6     1,237              23.4
    85-           .23352            519.1       494     25.1
    90-           .36484            136.6       129      7.6
    95-                              20.6        19      1.6
                                 --------    ------    -----    -----
                                 14,460.3    14,480    128.2    147.9
                                                          276.1


This result is very like that given by the late Sir G. F. Hardy,
but avoids having to obtain c by trial. Hardy's expected and
actual deaths balance better than the above, but I do not think
the rates have been understated systematically; the 75-79
group accounts for the disagreement. The total deviation is less
than Hardy's.

8. Another possible application of frequency-curves to life
assurance and mortality statistics was discussed recently. The
exposed to risk or the amount of the sums assured or premiums
at each age can usually be graduated by a frequency-curve.
When an actuary values the liabilities of an insurance company
he works, in effect, on the proportion of the business that
survives to each age in successive years according to the
mortality table assumed in the valuation. If the proportion,
at age x, that survives n years by a given table of mortality is
${}_np_x$ and if $E_x$ is the amount of sums assured, say, on the books
at age x, then the amount of sums assured surviving after n
years is $E_x\,{}_np_x$. For diverse mortality tables, various values
of n and a fairly wide range of frequency-curves assumed for
$E_x$, we again reach a frequency-curve as an approximation to
the distribution of $E_x\,{}_np_x$ in terms of x. Several statistical
examples have been given* and the reader who wishes to
examine other examples than those given in this book may
refer to them or to such a large collection of examples as those
given by K. Pearson and A. Lee for Barometric Heights.†

9. If we know the range of a curve we need not even with
Type I find as many as four moments, for the equations on
p. 64, giving the moments about the start of the curve, afford a
simple solution. We have


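$$\mu_1' = \frac{b(m_1+1)}{m_1+m_2+2}$$

and

$$\mu_2' = \frac{b^2(m_1+1)(m_1+2)}{(m_1+m_2+2)(m_1+m_2+3)},$$

and writing

$$\gamma_1 = \frac{\mu_1'}{b} \quad\text{and}\quad \gamma_2 = \frac{\mu_2'}{b\,\mu_1'},$$

we have

$$m_1 + 1 = \frac{\gamma_1(\gamma_2 - 1)}{\gamma_1 - \gamma_2}$$

and

$$m_2 + 1 = \frac{(\gamma_2 - 1)(1 - \gamma_1)}{\gamma_1 - \gamma_2},$$

where $\mu'$ is written for a moment about the start of the
curve.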

10. If, however, we can only fix by general considerations
the start of the curve, the following solution depending on
three moments is of use.

Writing

$$\lambda_2 = \frac{\mu_2'}{\mu_1'^2} \quad\text{and}\quad \lambda_3 = \frac{\mu_3'}{\mu_2'\,\mu_1'},$$

the values of the constants in the equation to the curve are
given by

$$m_1 + 1 = \frac{2(\lambda_2 - \lambda_3)}{2\lambda_3 - \lambda_2 - \lambda_2\lambda_3}$$

and

$$m_2 + 1 = \frac{2(\lambda_2 - \lambda_3)(\lambda_3 - 1)(1 - \lambda_2)}{(2\lambda_3 - \lambda_2 - \lambda_2\lambda_3)(1 + \lambda_3 - 2\lambda_2)},$$

$$b = \mu_1'\,\frac{m_1 + m_2 + 2}{m_1 + 1},$$

$$a_1/a_2 = m_1/m_2.$$

† Philos. Trans. A, cxc, 423.


11. We may return to Type I for an example of the method
of § 9, where we will assume that the curve starts at age 17.5 and
has a range of 15.5 units. Considering the line for age 22 in the
table on p. 60 we see that 4.175 and 14.634 give $S_2$ and $S_3$,
excluding the first group, and the moments about age 17 are
then found to be 4.175 and 25.093; transferring to 17.5, we
have 4.075 and 24.268; adding the moments for the first group,
$.034 \times \tfrac15$ and $.034 \times (\tfrac15)^2$ respectively, $\mu_1' = 4.0818$ and

    $\mu_2'$ = 24.26936

Hence

    $m_1$ = .3498        $a_1$ = 1.735
    $m_2$ = 2.7758       $a_2$ = 13.765
    $y_0$ = 154.2

and the mode is 17.5 + 1.735 × 5 = 26.175.

From these values the graduated figures for the first four
groups are 37, 140, 152, 143.
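
The two-moment solution of § 9 is short enough to check directly. The sketch below takes the start, range and moments just quoted, forms $\gamma_1 = \mu_1'/b$ and $\gamma_2 = \mu_2'/(b\mu_1')$, and then uses the Type I relations $a_1 + a_2 = b$ and $a_1/a_2 = m_1/m_2$ to place the mode; the variable names are illustrative.

```python
start_age, unit = 17.5, 5.0   # curve starts at age 17.5; unit of grouping is 5 years
b = 15.5                      # assumed range, in 5-year units
mu1, mu2 = 4.0818, 24.26936   # first two moments about the start of the curve

gamma1 = mu1 / b
gamma2 = mu2 / (b * mu1)

m1 = gamma1 * (gamma2 - 1) / (gamma1 - gamma2) - 1
m2 = (gamma2 - 1) * (1 - gamma1) / (gamma1 - gamma2) - 1

# Distances from the mode to the two ends: a1 + a2 = b and a1/a2 = m1/m2.
a1 = b * m1 / (m1 + m2)
a2 = b - a1
mode_age = start_age + a1 * unit

print(round(m1, 4), round(m2, 4), round(a1, 3), round(a2, 3), round(mode_age, 3))
```

The results agree with the printed constants to within the rounding of the original work.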

12. It may be of help to give another example of a J-shaped
curve and we take the first example of Table I, for which
the mean is at duration 4.182, and the moments and
constants are

    $\mu_2$ = 17.63688       $\beta_1$ = 3.34846
    $\mu_3$ = 135.5361       $\beta_2$ = 6.18392
    $\mu_4$ = 1923.565       $\kappa$ = -1.307

so that the curve will be of Type I and the equation to it is

$$y = .89082\,x^{-.629685}\,(25.49729 - x)^{1.624275},$$

the origin being at 1.02897 where the curve starts.

The graduation by this curve is shown in the following table:

    Duration    Withdrawals    Graduated by
                               Type I curve
        1           308            312
        2           200            198
        3           118            101
        4            69             76
        5            59             58
        6            44             45
        7            29             37
        8            28             30
        9            26             25
       10            21             21
       11            18             18
       12            18             15
       13            12             13
       14            11             11
       15             5              9
       16            11              7
       17             7              6
       18             6              5
       19             1              4
       20             3              3
       21             1              2
       22             3              2
       23             2              1
       24                            1
                  -----          -----
                  1,000          1,000



13. The calculation of the graduated area of the first group
may present a difficulty, as a quadrature formula cannot be
applied, and the following method gives the best way of
obtaining a correct value:

$$\int_0^x y_0'\,x^{m_1}(b-x)^{m_2}\,dx = \int_0^x y_0'\,x^{m_1}\Big\{b^{m_2} - m_2 b^{m_2-1}x + \frac{m_2(m_2-1)}{2!}\,b^{m_2-2}x^2 - \dots\Big\}dx$$

$$= y_0'\,b^{m_2}\,x^{m_1+1}\Big(\frac{1}{m_1+1} - \frac{m_2\,x}{b(m_1+2)} + \dots\Big)$$

which is a rapidly convergent series when x is small. In the
preceding example, where x is 1.5 - 1.02897 = .47103, the
second term barely affects the result. $y_0'$ must be calculated
by the formula

$$y_0' = \frac{N}{b^{m_1+m_2+1}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}.$$

The expression for finding the area of the first group in
Type III curves is

$$y_0'\,x^{p+1}\Big(\frac{1}{p+1} - \frac{\gamma x}{p+2} + \dots\Big)$$

where $y_0' = N\gamma^{p+1}/\Gamma(p+1)$.



CHAPTER VI

COMPARISON OF VARIOUS SYSTEMS
OF CURVES

1. In the previous chapter we dealt with Pearson's system
of frequency-curves, but other methods have been used to
describe frequency distributions. We have already seen that
Pearson's system of curves describes the facts that have been
collected about a variety of subjects connected with chance.
A system is useless if it does not give approximately the
distributions that actually occur. The binomial series is
justified from this point of view as a description of the number of
times events happen, because we have found from experience
that the numbers given by it are realised approximately by
trial. When we consider the matter we are almost compelled
to admit that the real justification of any theory of probability
is that events happen in the way such a theory leads one to
expect, and if we wish to compare the systems of
frequency-curves that have been suggested in recent years, it should be
done not so much by examining the ways in which they have
been derived as by seeing what classes of distribution they
represent and by noticing carefully the cases of failure and the
difficulties of application.

2. As we know from experience that the binomial series
actually represents a simple type of probability, it is natural to
start from it and treat it, or its limits, as a part of any system;
it must, in fact, be a special case of any more general type that
may be evolved.

We can proceed either by building up a curve on assumptions
which it seems natural to adopt or by taking a more
complex series than the binomial (e.g. the hypergeometrical)
and in either case an expression might be reached having
greater generality than the binomial. But it must be remembered
that the ultimate justification of any evolved formula
rests mainly on its breadth of application to statistics which
may reasonably be described as chance distributions. Such
application is an important test of the fundamental assumptions
that were adopted in reaching the formula, for it must be
admitted that the plausibility of the initial statements would
be poor defence of a curve which broke down whenever it was
put to a practical test.

The well-known “normal curve of error”, with which we
dealt on p. 80, was a first step towards finding a simple
frequency-curve, but though it works well as a description of
the binomial $(p + q)^n$ when p is approximately equal to q or
when n is large, it is unsatisfactory in other cases. In actuarial
work these cases frequently arise. At the ages attained by the
majority of lives assured in any assurance office the rate of
mortality or probability of a person dying in a year is small
and the frequency distribution giving the number of deaths
happening in successive years out of 50 cases, say, when
q = .02 and p = .98, would not be satisfactorily described by
the normal curve of error. It is true in a sense that the
“normal curve” is a law of great numbers, but if it can only
deal with cases resting on such a basis it cannot have a large
sphere of action in practical statistics and it can hardly be
expected to be of value when a series is more like the
hypergeometrical than the binomial.
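
The case of 50 lives with q = .02 can be worked out directly. The sketch below compares the binomial terms with the areas of a normal curve having the same mean and standard deviation; taking each normal area over a unit interval centred at the number of deaths is a choice made for this illustration and is not a method stated in the text.

```python
import math

n, q = 50, 0.02          # 50 cases, rate of mortality .02
p = 1.0 - q
mean, sd = n * q, math.sqrt(n * p * q)

def binomial(k):
    """Chance of exactly k deaths out of n."""
    return math.comb(n, k) * q ** k * p ** (n - k)

def normal_cdf(x):
    """Area of the fitted normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_group(k):
    """Area of the normal curve over the unit interval centred at k."""
    return normal_cdf(k + 0.5) - normal_cdf(k - 0.5)

for k in range(5):
    print(k, round(binomial(k), 4), round(normal_group(k), 4))

# Area the normal curve assigns below -1/2, i.e. to impossible negative counts.
below_zero = normal_cdf(-0.5)
print(round(below_zero, 4))
```

The normal curve understates the chance of no deaths considerably and places an appreciable area on negative numbers of deaths, which is the failure referred to above.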

3. It is this failure of the “normal curve” that has led to the
work of Pearson, Thiele, Charlier, Edgeworth, Bruns, Kapteyn
and others, and the curves suggested by these writers are of
considerable interest to all students of statistical mathematics.
In this chapter we shall indicate how far some of these curves
fit the statistics that arise in practice, how far, in fact, they
graduate the rough figures obtained from the collected facts,
and where they break down.

Before proceeding, however, it will be necessary to discuss
briefly the suggested types. We may also mention an old
difficulty in practical work of this nature, namely, that
statistics are seldom obtained from strictly homogeneous
material. This fact must be taken as one of the typical elements
in practice, and if a series can graduate in spite of a small
amount of heterogeneity it is, from some points of view, all the
more valuable in much of the work that comes to the hands of
an actuary or statistician.

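4. We may now turn to an expression which we will call
Type A, namely

$$F(x) = \phi_0(x) - \frac{1}{3!}\Big(\frac{\mu_3}{\sigma^3}\Big)\phi_3(x) + \frac{1}{4!}\Big(\frac{\mu_4}{\sigma^4} - 3\Big)\phi_4(x) - \frac{1}{5!}\Big(\frac{\mu_5}{\sigma^5} - 10\frac{\mu_3}{\sigma^3}\Big)\phi_5(x) + \dots$$

where

$$\phi_0(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^2/2\sigma^2} \quad\text{and}\quad \phi_n(x) = \sigma^n\frac{d^n}{dx^n}\phi_0(x).$$

So that, if $\sigma = 1$, i.e. if we measure in terms of the standard
deviation,

$$\phi_3(x) = (3x - x^3)\,\phi_0(x)$$

$$\phi_4(x) = (x^4 - 6x^2 + 3)\,\phi_0(x)$$

$$\phi_5(x) = (-x^5 + 10x^3 - 15x)\,\phi_0(x)$$

In applying these expressions x is measured throughout from
the mean in terms of the standard deviation, the measures
used in Tables for Statisticians. It may be mentioned that the
coefficients in round brackets in the equation for Type A as
set out above are the third, fourth and fifth semi-invariants.
In Tables for Statisticians (Pt. II, Tables V-VII)

$$\tau_{n+1}(h) = \frac{(-1)^n}{\sqrt{(n+1)!}}\,\frac{d^n}{dh^n}\Big(\frac{1}{\sqrt{2\pi}}\,e^{-h^2/2}\Big)$$

and when using these tables we write F as

$$\frac{1}{\sigma}\{\tau_1(h) + .81649658\sqrt{\beta_1}\,\tau_4(h) + .45643546(\beta_2 - 3)\,\tau_5(h) + \dots\}$$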
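
A short sketch may help to show how such a series behaves. It evaluates, for x in standard units, the normal ordinate corrected by the $\phi_3$ and $\phi_4$ terms given above, with coefficients $-\tfrac{1}{3!}(\mu_3/\sigma^3)$ and $+\tfrac{1}{4!}(\mu_4/\sigma^4 - 3)$; this is the usual Gram-Charlier arrangement and is assumed here, and the skewness used in the example is hypothetical.

```python
import math

def phi0(x):
    """Normal ordinate for x measured in terms of the standard deviation."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def type_a(x, skew, excess):
    """Type A ordinate at standardised x, using terms up to the fourth moment.

    skew   = mu3 / sigma^3      (third semi-invariant in standard units)
    excess = mu4 / sigma^4 - 3  (fourth semi-invariant in standard units)
    """
    phi3 = (3 * x - x ** 3) * phi0(x)
    phi4 = (x ** 4 - 6 * x ** 2 + 3) * phi0(x)
    return phi0(x) - skew / 6.0 * phi3 + excess / 24.0 * phi4

# Hypothetical skew case: ordinates at x = -4, -3.5, ..., 4.
ordinates = [type_a(x / 2.0, skew=1.0, excess=0.0) for x in range(-8, 9)]
print([round(v, 4) for v in ordinates])
```

With a marked skewness the series gives negative ordinates in one tail, a point that arises again in the numerical examples.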


This series has been discussed by many writers, especially on
the Continent*, and it may be regarded as the use of the
“normal curve” as a generating function. It has, naturally,
a greater range of applicability than the “normal curve”,
but it is not of service in the more extremely skew cases, and
it has been suggested by C. V. L. Charlier that, in such
circumstances, an expression Type B should be used. This is

$$F(x) = B_0\psi(x) + B_1\psi_1(x) + B_2\psi_2(x) + \dots$$

where

$$\psi(x) = e^{-m}\,\frac{\sin \pi x}{\pi}\Big\{\frac{1}{x} - \frac{m}{1!\,(x-1)} + \frac{m^2}{2!\,(x-2)} - \dots\Big\}$$

and $\psi_1(x) = \psi(x) - \psi(x-1)$, i.e. $\Delta\psi(x-1)$, and values of
$\psi(x)$ for $x < 0$ are assumed to be zero. Similarly
$\psi_2(x) = \Delta\psi_1(x-1)$.

* Gram, Thiele, Charlier, Bruns, etc. In a memoir entitled Researches into the
Theory of Probability (Meddelanden Lunds Astronomiska Observatorium, 1906),
C. V. L. Charlier gives several numerical examples and many useful notes.
J. P. Gram, on p. 94 of Om Rækkeudviklinger, bestemte ved mindste Kvadraters
Methode (Copenhagen, 1879), says that Oppermann had suggested the formula
some time before.

In the limit when x is an integer $\psi(x)$ becomes $e^{-m}m^x/x!$.
This expression is already well known in the theory of
probability as Poisson's series; the “normal curve” is sometimes
spoken of as a “law of great numbers” and the Poisson series
as a “law of small numbers”. Type B uses $e^{-m}m^x/x!$ as a
generating function similarly to the way in which Type A uses
the “normal curve”.

5. The fitting of Type B presents certain special difficulties
as alternative methods are available, but we may as a preface
to them point out that if we fit $e^{-m}m^x/x!$ by moments using all
integral values of x from $x = 0$ to $x = \infty$ we obtain

$$\mu_2 = m, \qquad \mu_3 = m, \qquad \mu_4 = 3m^2 + m,$$

or

$$\beta_1 = \beta_2 - 3 = 1/m.$$

This, however, assumes a system of ordinates, unit distance
apart, and we know that in practical statistical work these
assumptions limit us unduly.
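
These moments are easily confirmed by direct summation. In the sketch below the value of m is hypothetical and the series is cut off after 200 terms, by which point the ordinates are negligible.

```python
import math

def poisson_moments(m, terms=200):
    """Mean and central moments of e^-m m^x / x! over x = 0, 1, 2, ..."""
    probs = []
    term = math.exp(-m)          # ordinate at x = 0
    for x in range(terms):
        probs.append(term)
        term *= m / (x + 1)      # recurrence giving the next ordinate
    mean = sum(x * pr for x, pr in enumerate(probs))
    mu = [sum((x - mean) ** k * pr for x, pr in enumerate(probs))
          for k in (2, 3, 4)]
    return mean, mu

m = 2.5   # hypothetical value
mean, (mu2, mu3, mu4) = poisson_moments(m)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
print(round(mean, 6), round(mu2, 6), round(mu3, 6), round(mu4, 6))
```

The summation gives mean, second and third moments equal to m, and $\beta_1 = \beta_2 - 3 = 1/m$, as stated.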
We can, however, write

$$F(xw + c) = B_0\psi(x) + B_1\psi_1(x) + B_2\psi_2(x) + \dots$$

which implies that owing to w we have generalised the unit of
grouping and owing to c the point from which x is reckoned is
also generalised.

In this form Charlier suggests four methods of fitting and
remarks that the series usually becomes more convergent if we
arrange constants so that $B_1 = B_2 = B_3 = 0$.

(1) Assume $w = 1$ and $c = 0$, that is, revert to the original
form and choose m so that $B_1$ vanishes, and since $B_0 = N$ we
can reach

$$2!\,B_2 = N(\mu_2 - b)$$

$$3!\,B_3 = N(-\mu_3 + 3\mu_2 - 2b)$$

$$4!\,B_4 = N(\mu_4 - 6\mu_3 - 6b\mu_2 + 11\mu_2 + 3b^2 - 6b)$$

where b is the distance from the origin to the mean. This
method can be used when we can anticipate that m will not
differ greatly from b.

(2) Assume $w = 1$ and calculate c as an unknown constant,
choosing it and m so that $B_1$ and $B_2$ vanish:

$$c = b - \mu_2 \qquad\qquad 3!\,B_3 = N(\mu_2 - \mu_3)$$

$$m = \mu_2 \qquad\qquad 4!\,B_4 = N(\mu_4 - 3\mu_2^2 - 6\mu_3 + 5\mu_2)$$

(3) Find m, w and c so that $B_1 = B_2 = B_3 = 0$:

$$w = \mu_3/\mu_2 \qquad\qquad B_0 = N/w$$

$$m = \mu_2^3/\mu_3^2 \qquad\qquad c = b - \mu_2^2/\mu_3$$

This method usually gives w very small values and m very
large values when $\mu_3$ vanishes, so it is only applicable in
markedly skew cases.

(4) Fix c arbitrarily and find m and w so that $B_1 = B_2 = 0$:

$$m = (b - c)^2/\mu_2$$

$$w = \mu_2/(b - c)$$

$$B_0 = N/w$$

$$w^3\,3!\,B_3 = B_0(w\mu_2 - \mu_3)$$

$$w^4\,4!\,B_4 = B_0(\mu_4 - 3\mu_2^2 + 5w^2\mu_2 - 6w\mu_3)$$
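
Method (4) may be sketched as a small function. The left-hand factors $w^3\,3!$ and $w^4\,4!$ are taken as assumed readings of the equations, and the check uses moments constructed from an exact Poisson series placed on ordinates a distance $w_0$ apart starting at $c_0$; all the numerical values are hypothetical.

```python
def type_b_method4(N, b, mu2, mu3, mu4, c):
    """Constants of Type B by method (4): c fixed, B1 = B2 = 0."""
    m = (b - c) ** 2 / mu2
    w = mu2 / (b - c)
    B0 = N / w
    B3 = B0 * (w * mu2 - mu3) / (6 * w ** 3)
    B4 = B0 * (mu4 - 3 * mu2 ** 2 + 5 * w ** 2 * mu2 - 6 * w * mu3) / (24 * w ** 4)
    return m, w, B0, B3, B4

# Moments of an exact Poisson series with parameter m0, on ordinates w0 apart
# starting at c0 (all three values hypothetical).
m0, w0, c0, N = 3.0, 2.0, 10.0, 1000.0
b = c0 + w0 * m0                      # distance from the origin to the mean
mu2 = w0 ** 2 * m0
mu3 = w0 ** 3 * m0
mu4 = w0 ** 4 * (3 * m0 ** 2 + m0)

m, w, B0, B3, B4 = type_b_method4(N, b, mu2, mu3, mu4, c0)
print(m, w, B0, B3, B4)
```

For such exact moments the function returns the original m and w, with $B_3$ and $B_4$ vanishing, as they should when the data are themselves a Poisson series.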

It seems unnecessary to give the work in detail leading up to
the various sets of equations. Tables of $e^{-m}m^x/x!$ will be found
in Tables for Statisticians.

6. F. Y. Edgeworth* has used a series similar to Type A,
namely

$$\exp\Big\{-\frac{k_1}{3!}\Big(\frac{d}{dx}\Big)^3 + \frac{k_2}{4!}\Big(\frac{d}{dx}\Big)^4 - \dots\Big\}\,\phi_0(x)$$

where $k_1$, $k_2$, etc. are the third, fourth, etc. semi-invariants.
Expanding the exponential, we reach

$$\{\tau_1(h) + .81649658\sqrt{\beta_1}\,\tau_4(h) + .9860133\,\beta_1\,\tau_7(h) + \dots$$

$$\qquad + .45643546(\beta_2 - 3)\,\tau_5(h) + \dots + \text{etc.}\}$$

Arithmetically the difference between this series and Type A
is usually small. Type A does not include the $\tau_7$ term which
arises from $k_1^2\,\dfrac{1}{2!}\,\dfrac{1}{3!}\,\dfrac{1}{3!}\,\dfrac{d^6}{dx^6}$. Later terms would also differ, but
the expansion shown assumes that we shall not use more
than four moments and that $\tau_9$ etc. terms can be ignored.

7. It is possible to use other expressions, e.g. Type III,
instead of a normal curve as a generating function. A necessary
condition for a frequency function is that it must not produce
negative frequencies and the reader who wishes to pursue this
part of the subject may be referred to a lecture by Professor
Steffensen giving an interesting account from first principles.†
For a general discussion of Edgeworth's and the A series and
the theory underlying them the reader should study the
papers to which reference has already been made and also
Professor H. Cramér's paper “On the composition of
elementary errors” in Skandinavisk Aktuarietidskrift, 1928,
p. 13 etc. and p. 141 etc.

* Edgeworth contended that his equation was unique in its character and
theoretical basis. It avoids the negative frequencies which may arise with
Type A and are unjustifiable in theory. This last point will be brought out in the
numerical examples. Trans. Camb. Phil. Soc. 1905 (Law of Error); J. Roy. Statist.
Soc. 1906 (Generalised Law of Error).

† J. F. Steffensen, Some recent researches in the Theory of Statistics and Actuarial
Science (Cambridge University Press, 1930, Third Lecture).

It is not, however, pretended that the curves and series set
out above exhaust the suggestions that have been made, but
they may be taken to represent the methods that have
received most general support, and the examples we shall give
do not go beyond them. We may, however, mention that it has
been suggested that graduations should be made by writing
$y = e^{-\frac12\{f(x)\}^2}$. This way of using the “normal curve” has been
called the “Method of Translation” and in its most general
form is arbitrary. In practice the form of f(x) must be restricted
and certain special cases have been studied,* but the method
seems to be open both to practical and theoretical objections,
and it will not be discussed in detail.

8. Numerical Examples

Example I

(Symmetrical curve not capable of satisfactory graduation
by the normal curve of error)

    Observations   Pearson's   Type A   Edgeworth   Normal curve
                   Type II
         11            14         15        16           20
        116           109        106       106           95
        271           286        284       285          270
        451           433        437       436          456
        432           433        437       436          456
        267           285        283       284          270
        116           109        106       106           95
         16            14         15        16           20

In this case all the curves except the normal give excellent
graduations. We have not used Type B because Charlier
apparently only adopts it when Type A is unsuccessful. He
does not give a statistical criterion to show when A or B should
be used and it is difficult to see how such a criterion can be
evolved. The solution of his Type A does not lead to imaginary
quantities when Type B should have been used, in the way that
Pearson's Type I, for example, does when it is inapplicable. In
reaching Type A and the Edgeworth graduation we have
used the terms involving $A_4$ and $k_2$ respectively. $A_4$ is used
here for the coefficient of $\phi_4(x)$. Similarly hereafter with $A_n$.

Notice that $A_n$ involves $\mu_n$ but may also involve other $\mu$s.

* See Edgeworth, J. Roy. Statist. Soc. vol. lxi; Kapteyn, Skew Frequency
Curves in Biology and Statistics (Groningen, 1903); or Bowley, F. Y. Edgeworth's
Contributions to Mathematical Statistics, 1928.

Example II

(A distribution which is not markedly skew)

    Observations   Pearson's   Type A   Edgeworth
                   Type III
          3             4          5         4
         20            17         22        17
         38            42         47        42
         63            59         60        59
         51            53         50        53
         29            33         27        32
         21            15         13        15
          4             5          4         6
          0             1.4        1         2
          1              .4                  1

In each case three moments have been used. The observations
and Edgeworth's graduation are taken from Edgeworth's
paper, “The generalised law of error”. Type A is the least
successful.


Example III

(A distinctly skew distribution)

    Observations   Pearson's   Type A   Type B   Edgeworth
                   Type I
                                 -2        1          1
                                  8                   9
                        2        25       12         30
         64            67        53       64         64
        116           116        90      104        102
        140           138       125      129        130
        145           139       145      134        135
        134           128       143      128        130
        106           110       123      116        111
         82            89        93       93         92
         72            69        65       73         73
         49            51        44       53         53
         37            35        31       36         36
         25            24        23       25         20
         13            15        16       14         10
         10             9        10       10          4
          5             5         5        5
          2             2         2        2
                         .4       1        1          1

Pearson's figures come from his Chances of Death* and
Edgeworth's from his “Generalised law of error”. Each of
these graduations was obtained with four moments. Clearly
Pearson's Type I is the best and Type B the next best
graduation. We do not think Charlier would use Type A in such a case.
In fitting his Type B there are, however, many difficulties
owing to the fact that he gives us four approximate methods of
application; this is an objection which may be surmounted in
the future, but makes Type B awkward at present. The other
points to be noticed in these graduations are the negative
frequency in Type A and the 40 cases in Edgeworth's
graduation which have no case corresponding to them in the data.
Edgeworth, however, has remarked that he only aims at the
main body of the curve and does not much concern himself
with the tails, but one cannot help feeling that the main body
must be understated if one tail possesses an excess of 40 out of
1,000 cases and the other tail is in defect by only 20.

* Chances of Death, i, 74 (London, 1897).


Example IV 

(J -shaped curve) 


Observations     Pearson     Type B

     133          136.9       134.9
      55           48.5        51.6
      23           22.6        22.5
       7            9.6         9.5
       2            3.4         2.9
       2            0.8         0.6


The Type B curve is given by Charlier in Researches into
the Theory of Probability. The Type B curve gives a slightly
better graduation, but the agreement is close in both cases.
The example is not conclusive as to J-shaped curves, but shows
that Type B can graduate them successfully. The particular
example has only six groups, and with a curve of something
like the right shape and three constants we are likely to reach
close agreement. Edgeworth’s curve is unsuitable. A graduation
by Type A has been given elsewhere, but though it
apparently graduates the figures the curve is not J-shaped.


Example V

(Series which is nearly symmetrical)


Observations     Pearson’s      Type A     Edgeworth
                  Type IV

       10             6             4             3
       13            16            14            10
       41            49            46            34
      115           135           126           110
      326           321           306           298
      675           653           637           662
    1,113         1,108         1,108         1,164
    1,528         1,535         1,563         1,603
    1,692         1,712         1,753         1,747
    1,530         1,522         1,548         1,510
    1,122         1,074         1,075         1,024
      611           604           589           571
      255           274           256           263
       86           102            92           104
       26            32            29            37
        8             8             7            12
        2             2             2             2
        1             1             1             1

1 1 

! 





These graduations give similar results and need no comment.


Example VI 

(Distribution having two maxima) 


   Data      Pearson’s      Type A     Edgeworth
              Type II

     10           3            26            4
     78          96            74           34
    193         191           156          135
    286         261           262          270
    303         304           354          363
    291         319           390          390
    303         304           354          363
    286         261           262          270
    193         191           156          135
     78          96            74           34
     10           3            26            4


This is an imaginary example giving a double-humped
distribution. It was formed from Type A by putting $A_3 = 0$
and $A_4 = 09$, the series being

-4, -19, -53, -76, +103, +783, +1929, +2855, etc.

Negative frequencies, which are meaningless, were discarded
and the data cut down and graduated. The interesting feature
is that Type A, from which the data were formed, gives a poor
agreement. This is due to the negative frequencies and the
integration for moments from $-\infty$ to $+\infty$. Negative
frequencies are somewhat objectionable in themselves; they are
still more objectionable when they influence curve fitting to
the large extent shown in this example.

Example VII

We have remarked that there is a difficulty in choosing a
solution to Type B, but its graduating power compared with
other formulae can be indicated by setting out a few examples
of the forms taken by $e^{-m} m^x / x!$ from Tables for Statisticians.

For comparison I have added examples of Pearson’s Type
III, though it must not be supposed that either set is meant to
give the closest agreement with the other that it would be
possible to make; they have merely been taken to give an idea
of the range of application. By bringing in further terms
we can increase the range of Type B and by using the whole of
Pearson’s system we cover a wider range than that of his
Type III.


          Type B                 Pearson’s Type III

     I      II     III           I      II     III

    368    111      45          387     63      31
    368    244     140          386    279     149
    184    268     217          160    285     230
     61    197     224           47    189     218
     15    108     173           15    102     160
      3     48     107            4     49     101
      1     18      55            1     21      56
             6      25                   9      29
             2      10                   3      14
                     3                   1       6
                     1                           3





i 

I 

1 
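The Type B figures of Example VII are values of $e^{-m} m^x / x!$ per 1,000 cases, and a short sketch shows how such a column may be produced. The function name is our own, and the value m = 1 is an assumption inferred from the leading figure 368 (that is, 1000/e); the text does not state which values of m were used.

```python
import math


def poisson_per_thousand(m, max_x):
    """Frequencies 1000 * e^(-m) * m^x / x! for x = 0, 1, ..., max_x, rounded."""
    return [round(1000 * math.exp(-m) * m ** x / math.factorial(x))
            for x in range(max_x + 1)]


# m = 1 is an assumed value, chosen because 1000/e is about 368.
column_i = poisson_per_thousand(1.0, 6)
print(column_i)  # [368, 368, 184, 61, 15, 3, 1]
```

With this assumed m the figures agree with the first Type B column of the table.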


9. The few examples we have given will be of help in bringing
out the comparison of the types of curves with which we have
been dealing.

The Pearson-type curves will graduate satisfactorily all the
examples we have taken, but cannot reproduce the double
hump of our imaginary data (Example VI). They will graduate
symmetrical, slightly skew and very skew distributions and
also J- and U-shaped distributions. They have been fitted in
various circumstances and are satisfactory from the point of
view of agreement. The arithmetic involved is, however,
very heavy, but the curves are the most useful of those
now considered.

Type A gives numerically the least work, but it does not
graduate satisfactorily very skew or J- and U-shaped distributions
and it has therefore a smaller vogue. If, however, it is
combined with Type B as Charlier suggests, J-shaped and
skew distributions can be graduated. We have found some
difficulty in applying Type B, for Charlier does not give much
help in deciding which of his four methods of fitting should be
followed in a particular case, and we feel that the graduation
capacity of this type may be greater than our trials with it
justify us in thinking at present. It would clearly be impossible
to improve on its graduation in Example IV, but Example III
and two examples given by Charlier in his Researches into the
Theory of Probability are less fortunate.

Edgeworth’s curve can, roughly speaking, graduate the
same distributions as Type A.

10. We may now refer to two difficulties in connection with
Edgeworth’s curve and with Type A respectively which have
already been mentioned. In Example III we found that 40
out of 1,000 in Edgeworth’s graduation have no observations
corresponding to them and we remarked that it seemed a large
excess; the reproduction of the exact number of observations is
not only a practical necessity, but is assumed by the method of
moments. If, therefore, a large number of cases falls outside
the observations, we must either say that the total frequency is
not reproduced or that the frequencies are misplaced; in either
case the main body must be artificially reduced below the
amount shown in the original data. In slightly skew distributions
the frequencies are satisfactorily reproduced and many
of the graduations of such material are excellent, but the
method can hardly be considered satisfactory as a general
formula until some method of overcoming the difficulty
mentioned above has been found.

The difficulty in connection with Type A is the large part
that negative frequencies play in some of the less symmetrical
graduations. If a negative frequency occurs, have the positive
frequencies been overstated? The defence of such negatives is
that further terms of the series would put things right, but it is
hard to see the justification for basing much argument on
constants derived from the higher moments which are liable
to large variations and are unreliable. It is also unsatisfactory
that a curve cannot reproduce itself even approximately,
and the result of our Example VI is disappointing; probably,
however, it would be well to consider such cases as relating to
heterogeneous material and therefore more suitable for
representation by two or more superimposed curves.* If Type A
or Edgeworth’s curve and their moments could be integrated
from $-a$ to $b$ instead of from $-\infty$ to $\infty$, the difficulties could be
overcome to some extent, but, failing that, it would seem
necessary to limit the range of applicability to the less
abnormal distributions. An approximate method of fitting from
$-a$ to $\infty$ has been given,† but the results are not quite so good
as Pearson’s Type III.

11. If the reader makes any extensive trial with series for
the purpose of graduation, he will find occasionally that the
coefficients of successive terms are such as to imply that the
series may not be convergent. This is closely connected with
the difficulty mentioned in the preceding paragraph.

* We are doubtful if it is statistically possible ever to produce a double hump
with Type A or Edgeworth’s curve if the ordinary $-\infty$ to $\infty$ integration is
performed, because the relative values of the second and fourth moments required
by the coefficient in the formula would seem impossible.

† E. C. Rhodes, J. Roy. Statist. Soc. 1925, pp. 576 et seq.


CHAPTER VII


CORRELATION

1. We say that tall men have longer legs than short men,
that the older a bachelor the less likely he is to marry and have
children, that a man marrying late in life usually takes a wife
who is older than the wife of a man marrying early, or, to take
an example from life assurance practice, that, when endowment
assurances are grouped according to the unexpired term, the
mean ages at maturity increase with the unexpired term. All
these statements express in different words the fact that there
is some causal relationship, or correlation, between the height
of a man and the length of his legs, between the ages of husband
and wife or between the age at maturity of endowment
assurances and the unexpired term. The statements are, however,
in general terms; they do not help us to decide whether
one relationship is closer than another; they do not supply any
scale of correlation. The object, in statistical work, is to find
a measure; we have a scale for measuring probability and
similarly we want a scale for measuring correlation.

This suggests that if there is no correlation our scale ought
to measure zero and, just as certainty is indicated by a
probability of unity, so we may call our correlation unity when the
relationship is as close as possible. There is, however, one point
where the analogy between probability and correlation breaks
down: there is no such thing as negative probability, but we
can easily see that we can have negative correlation, for we may
have two things, A and B, which increase together like the
ages of husband and wife, or two things, C and D, one of which
increases as the other decreases like the age of a bachelor and
the number of children born from subsequent marriages.

2. With this introduction we may set down a definition of
correlation in the following words: “two measurable
characteristics, A and B, are said to be correlated when, with
different values x of A, we do not find the same value y of B
equally likely to be associated.” In other words, certain values
of B are more likely to occur with the value x than others. If
they were not, correlation would be absent, or, to take a
specific case, if men marrying at 20, or at 30, or at 60, or at any
other age always married women of 40, there would be no
correlation. On the other hand, the correlation would be
perfect if every man had to marry a woman exactly n years his
junior.


Unexpired term of endowment assurances (centre of group of 5 terms), by
central age at maturity. "Term" is the unexpired term, "Total" the total for
the row, "Mean" in the last column the mean maturity age for the row, and
"Mean" in the last line the mean unexpired term for the column. The small
numbers printed under the frequencies are shown in parentheses.

  Term    30    35    40    45    50    55    60    65    70    75   Total    Mean

     2     2                 2    26     6    14     6                  56   53.75
        (24)              (12)   (8)   (4)   (0)   (4)
     7     1     1     2     6    62    36    40    22     2           172   55.03
        (18)  (15)  (12)   (9)   (6)   (3)   (0)   (3)   (6)
    12           2     9    17   117    99   127    52     8     1     432   55.85
              (10)   (8)   (6)   (4)   (2)   (0)   (2)   (4)   (6)
    17     3           6    24   145   155   237    84    11           665   56.59
         (6)         (4)   (3)   (2)   (1)   (0)   (1)   (2)
    22           1           3   133   167   271    78    20     1     674   57.58
               (0)         (0)   (0)   (0)   (0)   (0)   (0)   (0)
    27                       9    90   123   231    71    11     3     538   57.88
                           (3)   (2)   (1)   (0)   (1)   (2)   (3)
    32                       1    11    49   127    49     8     2     247   59.94
                           (6)   (4)   (2)   (0)   (2)   (4)   (6)
    37                                   6    49    22                  77   61.04
                                       (3)   (0)   (3)
    42                                   2     2     3           1       8   62.50
                                       (4)   (0)   (4)        (12)
    47                                               1                   1   65.00
                                                   (5)

 Total     6     4    17    62   584   643  1098   388    60     8    2870

  Mean  10.3  13.2  13.2  16.1  17.2  20.1  21.9  21.7  21.5  27.6

Notes. For explanation of small numbers, see § 10.

A column or row is called an array. The middle value of the variable with
which the row is associated is called its type, so that the third column (i.e. that
headed 40) would be called the y-array of type 40, and the fourth row would
be called the x-array of type 17. The word ‘type’ is sometimes omitted.

3. The statistical aspect of the problem is exemplified in
the above table of double entry which gives particulars of
2,870 endowment assurances grouped according to unexpired
term.

A little examination of the table shows that correlation is
present, for we notice that the figures in the column giving the
mean maturity age for each row increase steadily from 53.75
to 65, while the term increases from 2 to 47. Similarly, the
mean unexpired terms increase from 10.3 to 27.6 as the age at
maturity increases from 30 to 75. The two sets of figures are
indicated in the diagram, p. 144. Now let us imagine that there
was no correlation; then the means of the columns would have
been independent of the other function, that is, we should have
found the same mean for each column. When plotted on a
diagram the means would have run horizontally. This suggests
that, perhaps, correlation might be measured by the slope of a
straight line drawn through the means, and we may follow up
this idea by fitting a straight line ($y = a_2 + b_2 x$) to the
correlation table and seeing what we can gather from the result.



4. When we were fitting curves to frequency distributions
we used the Method of Moments, and the following proof
adopts the same principle.

Let $x_1 y_1$, $x_2 y_2$, etc. be associated deviations, and let

$$y = a_2 + b_2 x$$

be the straight line used in the graduation; then the graduated
figure corresponding to $x_1$ is $a_2 + b_2 x_1$.

Now, if we proceed as we did in fitting frequency-curves by
the method of moments, we make the graduated and
ungraduated areas, means, etc. equal, or

$$(a_2 + b_2 x_1) + (a_2 + b_2 x_2) + \dots = y_1 + y_2 + \dots$$

or

$$N a_2 + b_2 S'(x) = S'(y).$$

And

$$(a_2 + b_2 x_1)x_1 + (a_2 + b_2 x_2)x_2 + \dots = x_1 y_1 + x_2 y_2 + \dots$$

or

$$a_2 S'(x) + b_2 S'(x^2) = S'(xy),$$

where $S'(x)$, being the sum of all the $x$’s, gives the first moment
of the $x$’s, $S'(y)$ the first moment of the $y$’s, $S'(x^2)$ the second
moment for the $x$’s, and $S'(xy)$ a moment in which any
frequency is multiplied by the product of the distances in the $x$
and $y$ directions.*

If these moments are now transferred to the mean, as was
done in fitting the frequency-curves, we have

$$N a_2 = 0 \quad\text{or}\quad a_2 = 0$$

and

$$b_2 S(x^2) = S(xy) \quad\text{or}\quad b_2 = \frac{S(xy)}{S(x^2)}.$$

But we have already seen that the second moment of the
whole frequency ($N$) is $N\sigma_1^2$, therefore

$$b_2 = \frac{S(xy)}{N\sigma_1^2}$$

and

$$y = \frac{S(xy)}{N\sigma_1^2}\,x.$$

If we now write $S(xy) = N\sigma_1\sigma_2 r$, we have

$$y = r\frac{\sigma_2}{\sigma_1}x, \qquad x = r\frac{\sigma_1}{\sigma_2}y,$$

where $r$ will represent the statistical measure of correlation
(coefficient of correlation) between the $x$’s and $y$’s and the
second equation has been evolved similarly to the first.

[Diagram, p. 144.] Note. The mean unexpired terms corresponding to actual
central ages at maturity are shown × and the mean central ages at maturity
corresponding to actual unexpired terms are shown o.

The diagram is arranged so that the standard deviation of the maturity ages
is represented by the same length as the standard deviation of the unexpired
terms and, consequently, the angles formed by the two regression lines with their
respective axes are equal. The tangent of this angle in each case is $r$ (0.254).


* Cp. Table II, p. 16. The frequency 29 is multiplied by the appropriate
value (-4). It would be the same thing if we took the distance (-4) of each
of the 29 cases and added these twenty-nine (-4)’s together.

5. At first sight it may appear that the two equations just
given, showing the relationship between x and y, are not
consistent. It must, however, be remembered that the first,
$\bar{y} = r\frac{\sigma_2}{\sigma_1}x$, gives the mean values of y corresponding to
particular values of x (indicated by the insertion of a bar over the y),
while the second gives the mean values of x corresponding to
particular values of y. To take a simple case as an example,
assume that $\sigma_1 = \sigma_2 = 1$ and that $r = 0.1$; then if x = 0 the mean
of the y’s corresponding to this value of x is 0, and if x = 20 the
mean of the y’s will be 2. When we turn the matter round,
however, we cannot, of course, assert that the mean of the x’s
corresponding to y = 2 is 20; it will be 0.2.
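The distinction between the two regression equations may be checked numerically. The following is only a sketch: the function names are ours, and r is taken as 0.1 with both standard deviations equal to 1 (the value of r for which x = 20 gives a mean y of 2).

```python
def mean_y_given_x(x, r, sigma_x, sigma_y):
    """Mean of the y's for a given x, both measured from their means."""
    return r * sigma_y / sigma_x * x


def mean_x_given_y(y, r, sigma_x, sigma_y):
    """Mean of the x's for a given y, both measured from their means."""
    return r * sigma_x / sigma_y * y


r, sigma_x, sigma_y = 0.1, 1.0, 1.0      # assumed values for the simple case
y_bar = mean_y_given_x(20, r, sigma_x, sigma_y)
x_bar = mean_x_given_y(y_bar, r, sigma_x, sigma_y)
print(y_bar, x_bar)
```

Starting from x = 20 the mean y is about 2, but starting from y = 2 the mean x is about 0.2 and not 20: each line gives an average, and the two are inverse to one another only when r is unity.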

6. After this preliminary remark we may return to the two
equations and consider how it is that r is a measure of
correlation and whether it can always be treated as a satisfactory
measure. We can best see that r is a measure of correlation by
rewriting the equation $\bar{y} = r\frac{\sigma_2}{\sigma_1}x$ in the form
$\frac{\bar{y}}{\sigma_2} = r\frac{x}{\sigma_1}$ or $Y = Xr$, and we can then interpret it as giving
one characteristic in terms of the other where the mean is the origin (this
is due to referring moments to the mean in the proof) and the
unit of measurement is the standard deviation in each case.
In this form we see at once that as one characteristic (X)
increases the mean (Y) of the corresponding series of the other
characteristic increases to an extent which depends on the
value of r, while if r is negative Y decreases. It is only if r is
unity that the increments of X and Y become equal and
absolute correlation is reached. If Y remains constant as the
value of X increases, the definition at the beginning of this
chapter tells us that there is no correlation, and r in this case
is zero as can easily be seen from the equation $Y = Xr$. We
have anticipated that our scale for measuring correlation
should run from $-1$ to $+1$, but we may accentuate the fact
that a large negative value does not mean that the two
characteristics do not vary together but only that increases in
the one correspond with decreases in the other; the numerical
value of r indicates the extent to which variations in the two
characteristics correspond. This indication is satisfactory
provided the means, when plotted in a diagram such as that on
p. 144, fall approximately in a straight line (i.e. “regression”*
is linear). Distinct deviations from linearity are not so common
as might be supposed, but if they are very marked in any case,
r ceases to be a satisfactory measure of the correlation.

7. We may take this opportunity of removing another
difficulty that is sometimes met. Some students have a doubt
which is best shown by the question “How can there be perfect
correlation when one thing is always smaller than another?”
As an example we may take the correlation between the lengths
of a man’s right arm and his left arm; here the coefficient of
correlation would be practically unity, and since each
characteristic is measured from its own mean, and in terms of its
own standard deviation, the coefficient would not be decreased
if every left arm was a certain number of inches shorter than
the right or if it bore a fixed relation in length, say 99/100, to
the right arm.
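The point may be verified on a few figures. In the sketch below the arm lengths are hypothetical numbers invented purely for illustration, and the function `correlation` is our own direct implementation of $r = S(xy)/(N\sigma_1\sigma_2)$ without Sheppard’s adjustment.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical right-arm lengths in inches (illustrative only).
right = [28.0, 29.5, 30.0, 31.2, 32.4]
left_shorter = [length - 0.5 for length in right]   # always half an inch shorter
left_ratio = [0.99 * length for length in right]    # always 99/100 of the right

print(correlation(right, left_shorter))
print(correlation(right, left_ratio))
```

Both coefficients come out at unity (to within rounding), because a constant difference alters only the mean and a constant ratio alters only the standard deviation, and r is measured from the one in units of the other.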

8. It is now necessary to discuss the arithmetical calculations
and if we look back at the formulae at the end of § 4 we see that
we require two standard deviations and a value for $S(xy)$. We
have already seen how standard deviations are obtained and it
will be remembered that when the calculation of moments was
discussed we found that, though they were required about the
mean, it was best in practice to take them about some point
fixed arbitrarily so as to avoid fractions and then adjust the
results afterwards. The values of $\sigma_1$ and $\sigma_2$ can, of course,
be found with the help of the formula on p. 57, viz. $\nu_2 = \nu'_2 - d^2$.
The deduction of $\frac{1}{12}$ from the second moment should be made
for the same reason and in the same cases as in frequency-curve
fitting.

* The term “regression” was adopted by Francis Galton in connection with
the study of heredity; it indicates the way the children of particular parents
tend to “step back” to the ordinary population mean.
With regard to the product moment we have

$$S(x'y') = S\{(x + d_1)(y + d_2)\} = S(xy) + d_1 S(y) + d_2 S(x) + N d_1 d_2,$$

or since $S(x) = S(y) = 0$

$$S(xy) = S(x'y') - N d_1 d_2,$$

where $S(x'y')$ is calculated about a point distant $d_1$ from the
mean of the $x$’s and $d_2$ from the mean of the $y$’s.
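The adjustment of the product moment can be tried on a small set of figures. This is a sketch only: the four pairs of values and the arbitrary origin (3, 4) are invented for the purpose, and the function names are ours.

```python
def product_moment_direct(xs, ys):
    """S(xy) taken directly about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def product_moment_adjusted(xs, ys, origin_x, origin_y):
    """S(x'y') about an arbitrary origin, less N * d1 * d2."""
    n = len(xs)
    x_dash = [x - origin_x for x in xs]
    y_dash = [y - origin_y for y in ys]
    d1 = sum(x_dash) / n          # distance of the x mean from the origin
    d2 = sum(y_dash) / n          # distance of the y mean from the origin
    raw = sum(a * b for a, b in zip(x_dash, y_dash))
    return raw - n * d1 * d2


xs = [1, 2, 4, 7]                 # illustrative values only
ys = [2, 3, 5, 10]
print(product_moment_direct(xs, ys))            # 28.0
print(product_moment_adjusted(xs, ys, 3, 4))    # 28.0
```

Here $S(x'y') = 30$, $d_1 = 0.5$, $d_2 = 1$ and $N d_1 d_2 = 2$, so that both routes give 28.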

9. The statistical example on p. 142 can now be worked
through. It will be found to make the proofs and methods
given above much easier to grasp.

A point about which moments are to be calculated is first
fixed, say the middle of the group corresponding to maturity
age 60 and unexpired term 22 years, and for the present the
calculations are made about this point. The following table
shows the calculation of the mean and the second moment of
the totals of the y-arrays, i.e. the totals at the bottom of the
table, because columns are y-arrays and rows x-arrays.

 Frequency      x'     Frequency × x'     Frequency × (x')²

       6        -6          -36                  216
       4        -5          -20                  100
      17        -4          -68                  272
      62        -3         -186                  558
     584        -2       -1,168                2,336
     643        -1         -643                  643
   1,098         0
     388         1         +388                  388
      60         2         +120                  240
       8         3          +24                   72

   2,870 = N              + 532                4,825
                        - 2,121
                        -------
                        - 1,589

$$d_1 = -\frac{1589}{2870} = -0.55366$$

Hence, the mean age = 60 - 2.7683 = 57.2317, because the
unit of grouping is 5 years.
$$\sigma_1^2 = \frac{4825}{2870} - d_1^2 - \frac{1}{12} \quad \text{(Sheppard’s adjustment)}$$

$$= 1.37465 - 0.08333 = 1.29132$$

$$\sigma_1 = 1.13637$$

Treating the rows in the same way, the following table was
formed:

 Frequency      y'     Frequency × y'     Frequency × (y')²

      56        -4         -224                  896
     172        -3         -516                1,548
     432        -2         -864                1,728
     665        -1         -665                  665
     674         0
     538         1         +538                  538
     247         2         +494                  988
      77         3         +231                  693
       8         4          +32                  128
       1         5           +5                   25

   2,870 = N            + 1,300                7,209
                        - 2,269
                        -------
                          - 969

$$d_2 = -\frac{969}{2870} = -0.33763$$

Mean unexpired term = 22 - 1.68815 = 20.31185

$$\sigma_2^2 = \frac{7209}{2870} - d_2^2 - \frac{1}{12} = 2.31453$$

and $\sigma_2 = 1.52135$.
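The arithmetic of the two tables above can be repeated mechanically. The sketch below takes the marginal totals of the correlation table, measures each group in units of 5 years from the arbitrary origin (age 60, term 22), and applies Sheppard’s adjustment of 1/12; the function and variable names are our own.

```python
import math


def grouped_moments(freqs, offsets):
    """Return N, d (mean about the arbitrary origin) and sigma, in grouping units."""
    n = sum(freqs)
    d = sum(f * o for f, o in zip(freqs, offsets)) / n
    raw_second = sum(f * o * o for f, o in zip(freqs, offsets)) / n
    variance = raw_second - d ** 2 - 1 / 12     # Sheppard's adjustment
    return n, d, math.sqrt(variance)


# Column totals (ages 30..75) and row totals (terms 2..47) of the table.
age_freqs = [6, 4, 17, 62, 584, 643, 1098, 388, 60, 8]
term_freqs = [56, 172, 432, 665, 674, 538, 247, 77, 8, 1]

n, d1, sigma1 = grouped_moments(age_freqs, range(-6, 4))    # origin at age 60
_, d2, sigma2 = grouped_moments(term_freqs, range(-4, 6))   # origin at term 22

mean_age = 60 + 5 * d1       # back to years
mean_term = 22 + 5 * d2
print(n, round(d1, 5), round(sigma1, 5), round(mean_age, 4))
print(n, round(d2, 5), round(sigma2, 5), round(mean_term, 5))
```

The results agree with the figures in the text to within a unit in the last place printed.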

10. The value of $S(xy)$ is formed with the help of the
numbers in very small type appearing under the frequencies
in the correlation table. The frequency 62 in the 50 column,
for instance, is distanced three spaces upwards and two
sideways from the arbitrary origin, so the value of $x'y'$ by which it
has to be multiplied is 3 × 2 = 6, as shown in the small type.
The other figures are obtained in like manner, but the sign
must be borne in mind. Any value from the left-hand upper
division of the table, or in the lower right-hand division, will be
positive, because the frequency will be multiplied by a product
of an x and y having like signs, while any value from the other
divisions will be negative, because the x and y by which the
frequencies are multiplied are of opposite signs.

The calculation of the product moment is as follows:

 Frequencies                                      x'y'    Total of          f × x'y'
                                                          frequencies (f)

 155 + 71 - 84 - 123                                1         + 19             + 19
 145 + 99 + 11 + 49 - 11 - 52 - 49 - 90             2          102              204
 24 + 36 + 3 + 22 - 22 - 6 - 9                      3           48              144
 6 + 6 + 8 + 3 - 6 - 8 - 11 - 2 + 117               4          113              452
 1                                                  5            1                5
 3 + 17 + 62 + 2 - 1 - 1 - 2                        6           80              480
 9 + 26                                             8           35              280
 6                                                  9            6               54
 2                                                 10            2               20
 2 + 2 + 1                                         12            5               60
 1                                                 15            1               15
 1                                                 18            1               18
 2                                                 24            2               48

                                                                              1,799

$$S(xy) = S(x'y') - N d_1 d_2 = 1799 - N d_1 d_2 = 1262.51$$

$$r = \frac{S(xy)}{N\sigma_1\sigma_2} = \frac{1262.51}{2870 \times 1.13637 \times 1.52135} = 0.254$$

The coefficient of correlation between age at maturity and
the unexpired term of endowment assurances is 0.254.

The equation representing the one function in terms of the
other is

$$x = r\frac{\sigma_1}{\sigma_2}\,y = 0.190\,y$$

where all measurements are made from the mean and the unit
is 5 years. The line drawn in the figure gives this result.
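The whole calculation of §§ 9 and 10 can be checked from the correlation table itself. The sketch below is ours, not the author’s procedure: it forms every product $f x' y'$ directly rather than grouping cells by the value of $x'y'$, but it should reach the same totals. Rows are unexpired terms 2 to 47 and columns central ages 30 to 75; empty cells are entered as 0.

```python
import math

TABLE = [
    [2, 0, 0, 2, 26, 6, 14, 6, 0, 0],
    [1, 1, 2, 6, 62, 36, 40, 22, 2, 0],
    [0, 2, 9, 17, 117, 99, 127, 52, 8, 1],
    [3, 0, 6, 24, 145, 155, 237, 84, 11, 0],
    [0, 1, 0, 3, 133, 167, 271, 78, 20, 1],
    [0, 0, 0, 9, 90, 123, 231, 71, 11, 3],
    [0, 0, 0, 1, 11, 49, 127, 49, 8, 2],
    [0, 0, 0, 0, 0, 6, 49, 22, 0, 0],
    [0, 0, 0, 0, 0, 2, 2, 3, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]
X_OFFSETS = range(-6, 4)    # columns, in units of 5 years from age 60
Y_OFFSETS = range(-4, 6)    # rows, in units of 5 years from term 22


def correlation_from_table(table):
    """Return N, S(x'y'), S(xy) and r, using Sheppard's adjustment of 1/12."""
    n = sum(map(sum, table))
    sx = sxx = sy = syy = raw_xy = 0
    for y_off, row in zip(Y_OFFSETS, table):
        for x_off, f in zip(X_OFFSETS, row):
            sx += f * x_off
            sxx += f * x_off * x_off
            sy += f * y_off
            syy += f * y_off * y_off
            raw_xy += f * x_off * y_off
    d1, d2 = sx / n, sy / n
    var_x = sxx / n - d1 ** 2 - 1 / 12
    var_y = syy / n - d2 ** 2 - 1 / 12
    s_xy = raw_xy - n * d1 * d2
    r = s_xy / (n * math.sqrt(var_x * var_y))
    return n, raw_xy, s_xy, r


n, raw_xy, s_xy, r = correlation_from_table(TABLE)
print(n, raw_xy, round(s_xy, 2), round(r, 3))
```

The run gives N = 2870 and $S(x'y') = 1799$, with $S(xy)$ close to 1262.5 and r = 0.254 to three places, in agreement with the worked figures.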
11. An alternative method similar to the summation
method given in § 9, Chapter III for moments can be
conveniently used in connection with correlation tables.

Taking the same example, we obtain from the given table
another in the same form, giving the y sum of it by summing
each column continuously, and then form a third table by
summing the second table across continuously.


Table of the y-sum of Correlation Table

(Rows: unexpired term of endowment assurances. Columns: central age at maturity.)

  Term     30     35     40     45     50     55     60     65     70     75   Totals

     2      6      4     17     62    584    643  1,098    388     60      8    2,870
     7      4      4     17     60    558    637  1,084    382     60      8    2,814
    12      3      3     15     54    496    601  1,044    360     58      8    2,642
    17      3      1      6     37    379    502    917    308     50      7    2,210
    22      0      1      0     13    234    347    680    224     39      7    1,545
    27      0      0      0     10    101    180    409    146     19      6      871
    32      0      0      0      1     11     57    178     75      8      3      333
    37      0      0      0      0      0      8     51     26      0      1       86
    42      0      0      0      0      0      2      2      4      0      1        9
    47      0      0      0      0      0      0      0      1      0      0        1

Totals     16     13     55    237  2,363  2,977  5,463  1,914    294     49   13,381


Table of x-sum of above Table, i.e. Table giving all
cases for xy group and over in Correlation Table

(Rows: unexpired term of endowment assurances. Columns: central age at maturity.)

  Term      30      35      40      45      50      55      60      65      70      75   Totals

     2   2,870   2,864   2,860   2,843   2,781   2,197   1,554     456      68       8   18,501
     7   2,814   2,810   2,806   2,789   2,729   2,171   1,534     450      68       8   18,179
    12   2,642   2,639   2,636   2,621   2,567   2,071   1,470     426      66       8   17,146
    17   2,210   2,207   2,206   2,200   2,163   1,784   1,282     365      57       7   14,481
    22   1,545   1,545   1,544   1,544   1,531   1,297     950     270      46       7   10,279
    27     871     871     871     871     861     760     580     171      25       6    5,887
    32     333     333     333     333     332     321     264      86      11       3    2,349
    37      86      86      86      86      86      86      78      27       1       1      623
    42       9       9       9       9       9       9       7       5       1       1       68
    47       1       1       1       1       1       1       1       1       0       0        8

Totals  13,381  13,365  13,352  13,297  13,060  10,697   7,720   2,257     343      49   87,521
The totals in the right-hand column of the upper table give
the first sum of the total in the right-hand column of the
correlation table, and are the same as the column x = 30 in the
lower table. The total of the y sum, or of the first column in the
xy table, gives the mean of the y’s (13,381/2,870), and
similarly the sum of the first row gives the mean of the x’s
(18,501/2,870).

The total of the last table gives the xy moment (87,521), and
the x standard deviation is found by forming from the first row
the series 18501, 15631, 12767, 9907, 7064, 4283, 2086, 532,
76, 8, and summing it, i.e. 70,855. The second moment about
the mean can then be found, the numerical work being as
follows:

$$x\ \text{mean} = \frac{18501}{2870} = 6.4463$$

$$\nu_2 = 2S_3 - d(1 + d) = \frac{2 \times 70855}{2870} - 6.4463 \times 7.4463 = 1.3747$$

Similarly with the y moments:

$$y\ \text{mean} = \frac{13381}{2870} = 4.6624$$

$$\nu_2 = \frac{2\,(13381 + 10511 + 7697 + 5055 + 2845 + 1300 + 429 + 96 + 10 + 1)}{2870} - 4.6624 \times 5.6624 = 2.3979$$

$$\text{The } xy \text{ moment} = \frac{87521}{2870} - 6.4463 \times 4.6624 = 0.4399$$

Remembering that $\nu_2 - \frac{1}{12}$ (Sheppard’s adjustment) $= \sigma^2$
and that the means are, in the above work, measured from the
centre of the group x = 25, y = -3 years, the values just
given will be found to agree with those previously obtained by
the direct method. The xy moment (0.4399) is the same as
1262.51/2870, i.e. $S(xy)/N$.

12. We have already remarked (§ 6) that the method we have
used for measuring correlation assumes that the means of the
rows and of the columns, respectively, lie on straight lines, and
consequently we must examine a table to see whether this
holds. One advantage of the method given in the previous
paragraph is that it enables us to get the means of each
column and of each row very easily. Remembering

(1) that the interval between the groups of unexpired terms
is 5 years,

(2) that the interval between central ages at maturity is
5 years,

(3) that the arbitrary origin is the point representing central
age at maturity = 25 and unexpired term = -3,

we can get the means of the columns by taking each total in the
y-sum table, multiplying by 5, dividing by the number of cases
and subtracting 3; thus, for the column with central age 50,

\[ \frac{2363 \times 5}{584} - 3 = 17.23 \]

The means of the rows come from the differences of the totals
on the right of the next table, and thus for unexpired term 2 we
have

(18501 - 18179) × 5/56 + 25 = 53.75

and for unexpired term 17

(14481 - 10279) × 5/665 + 25 = 56.59
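
The same differencing gives every row mean at once. The sketch below is an illustration only, and the function and variable names are hypothetical. It assumes that the number of cases in a row is the difference of successive entries in the first column of the xy table, which agrees with the divisors 56 and 665 used above.

```python
def row_means(row_totals, cases_at_or_beyond, unit=5, origin=25):
    """Mean age at maturity for each unexpired-term group.

    row_totals: right-hand totals of the xy table, one per term group.
    cases_at_or_beyond: first column of the same table (cases in that
    term group or a longer one). Differences of successive entries give
    the first moment and the number of cases for a single row.
    """
    totals = list(row_totals) + [0]
    cases = list(cases_at_or_beyond) + [0]
    return [
        (totals[i] - totals[i + 1]) * unit / (cases[i] - cases[i + 1]) + origin
        for i in range(len(row_totals))
    ]

terms = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47]
right_hand_totals = [18501, 18179, 17146, 14481, 10279, 5887, 2349, 623, 68, 8]
first_column = [2870, 2814, 2642, 2210, 1545, 871, 333, 86, 9, 1]

means = row_means(right_hand_totals, first_column)
for term, mean in zip(terms, means):
    print(term, round(mean, 2))
```

Plotting these means against the terms is then a quick way of judging whether they lie near a straight line.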

13. There is yet a third way of doing the arithmetical work to
reach the coefficient of correlation and, as it is short and relies
on one of the series of means, it has a good deal to commend it.
The calculation is as follows:
        (1)                        (2)                    (3)
First moment of column      Distance of column         (1) × (2)
(for total frequency in     from the arbitrary
column) about arbitrary     origin of rows
origin of columns           (i.e. age 60)
(i.e. unexpired term 22)

        - 14                       -6                   +    84
        -  7                       -5                   +    35
        - 30                       -4                   +   120
        - 73                       -3                   +   219
        -557                       -2                   + 1,114
        -238                       -1                   +   238
                                    0
        - 26                       +1                   -    26
        -  6                       +2                   -    12
        +  9                       +3                   +    27

                                               Total      1,799


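\[ r = \frac{\text{Total of (3)} - N d_1 d_2}{N \sigma_1 \sigma_2} = .254 \text{ as in § 10,} \]

where $d_1$ and $d_2$ are the distances of the two means from the
arbitrary origin (age 60, unexpired term 22).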
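
As a check on this third method, the total of column (3) can be reduced to the product moment about the means. The sketch below is an illustration under a stated assumption: that the total 1,799 is a product moment about the arbitrary origin (age 60, term 22) and must be corrected by N times the product of the distances of the two means from that origin. The variable names are hypothetical.

```python
N = 2870
col_1 = [-14, -7, -30, -73, -557, -238, -26, -6, 9]  # first moments of columns
col_2 = [-6, -5, -4, -3, -2, -1, 1, 2, 3]             # distance from age 60
total_3 = sum(m * d for m, d in zip(col_1, col_2))

# Means in 5-year units measured from the working origin (age 60, term 22).
# 18501/2870 and 13381/2870 are the means from age 25 and term -3, and the
# working origin lies 7 and 5 units beyond those points.
d_x = 18501 / N - 7
d_y = 13381 / N - 5

s_xy = total_3 - N * d_x * d_y  # product moment about the means
print(total_3, round(s_xy, 1), round(s_xy / N, 4))
```

The corrected total agrees with the figure 1262.5 for S(xy), and dividing by N gives the xy moment .4399.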


The unit throughout is 5 years, and the easiest way to do the
calculation for col. (1) is as shown in the table on p. 155.

There is no need to insert a column for age 60, or a row for
term 22, as these are multiplied by zero; they are sometimes
worked out for completeness and because they make it easier
to apply arithmetical checks which the reader can evolve for
himself.

If the reader considers any item in this scheme, e.g. 18 in the
column headed 40, he will see that it represents 9 cases in the
table (p. 142) multiplied by -2, and when it is, amongst
other numbers, taken to col. (1) of the table above, it will be
multiplied by -4; that is, we shall have multiplied 9 by
(-2) × (-4), i.e. by 8, which is the little figure written under 9
in the table on p. 142.

Before dealing with other examples and methods, it may be
well to point out a use to which the particular example might
be put. The result in the equation form gives the average age
corresponding to each unexpired term. Now, we might weight
(Frequency in column) × (distance from arbitrary origin)

Distance from                              Maturity age
arbitrary origin          30    35    40    45    50    55    65    70    75

"Minus" products
  -4                       8                 8   104    24    24
  -3                       3     3     6    18   186   108    66     6
  -2                             4    18    34   234   198   104    16     2
  -1                       3           6    24   145   155    84    11
Total minus               14     7    30    84   669   485   278    33     2

"Plus" products
  +1                                         9    90   123    71    11     3
  +2                                         2    22    98    98    16     4
  +3                                                     18    66
  +4                                                      8    12           4
  +5                                                            5
Total plus                                  11   112   247   252    27    11

Figs. for col. (1)       -14    -7   -30   -73  -557  -238   -26    -6    +9



each entry with Lidstone's Z's* or with the temporary
annuities, then work out an equation in each case, and get new
series of average ages. The results used in a valuation would give
the relative accuracy of the three methods. I have worked out
the formula with the Z weights (H^M Table), and found that

Age at maturity = 57*595 + T200 x (unexpired term) 

The results could also be used as a rough check on the average
ages at valuations, and there certainly seems a possibility of
doing something towards making a simple "model office" for
endowment assurances with the help of the method we have
been using.

* The method used by me was approximate and can probably be improved;
the result is merely given as an indication of a possible line for research.
CHAPTER VIII


THEORETICAL DISTRIBUTIONS 
SPURIOUS CORRELATION 

1. In the previous chapter we saw that it was natural to
want a scale for measuring correlation, and we showed that if
we simplify the full table by fitting a straight line* to the
statistics, then its slope might be taken as a measure of
correlation. But, though this seems reasonable on the evidence, it is
not conclusive; it might be better to use some function of the
slope of the line rather than the slope itself, or we might find
from experience that a straight line was not the best thing to
use in our simplification. We have, therefore, to see if these
doubts can be removed, and a way to do this is to consider
correlation from a theoretical standpoint by building up tables
in which we can estimate the amount of correlation from
general considerations.

2. Various correlation tables can be devised, but we may
begin by taking a case where ten coins are tossed and eight
of them are left on the table, the other two being re-tossed.
Then we have a pair of tossings in which eight coins out of ten
are common to each member of the pair. We repeat the
experiment a number of times and produce a correlation table in
which, as eight out of ten coins are fixed, we may expect the
correlation to be measured by .8 or at any rate by a function of
.8. Similarly, if we leave 5 coins the coefficient should be .5
and if we leave 2 coins it should be .2. The tables worked out
theoretically would be as shown on pp. 165-7. These tables
are symmetrical, the two standard deviations are the same and
the means (see last column and bottom row) run in a straight
line, and the slope of the line, judged by the tangent of the angle
it makes with the horizontal, is .8 or .5 or .2.

* We reach two straight lines for each correlation table, one corresponding
to the means of the columns and the other corresponding to the means of the
rows.

3. We may indicate how the tables were formed by taking
the one that gives the numbers when eight coins are left.
Consider those cases in which there were ten "heads" at the
first throw. Then all the eight coins left must be "heads". The
other two on being re-tossed will be in the following
proportions:

2 Heads ................ 1 case
1 Head and 1 Tail ...... 2 cases
2 Tails ................ 1 case

Thus, with the eight heads left, we conclude that for four cases
producing 10 heads at the first throw, one will produce
10 heads at the second throw, two will produce 9 heads, and
one will produce 8 heads. Now consider the next case where
there are nine "heads" and one "tail" at the first throw. Then
we can leave either eight heads or seven heads and one tail;
the number of ways in which we can do this is $^9C_8$ and $^9C_7$, that
is 9 and 36 or, as we are only concerned with proportions, as
1 : 4. The two re-tossed coins will be thrown in the proportion
of 1 HH : 2 HT : 1 TT, and we can then produce the second
column. The reader will appreciate that the totals of the
columns will be a multiple of the terms of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$.

The coefficients worked out by the methods of Chapter VII
give the values .8, .5 and .2. For example, S(xy) for the .8 table
will be found to be 8192 and $\sigma_1 = \sigma_2 = \sqrt{2.5}$. Therefore

\[ r = \frac{S(xy)}{N\sigma_1\sigma_2} = \frac{8192}{4096 \times 2.5} = .8 \]
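
The theoretical tables can be rebuilt by counting equally likely cases. The sketch below is an illustration rather than the construction printed in the text: instead of working column by column with proportions, it enumerates the heads among the coins left and among the re-tossed coins at each throw, which yields the same frequencies. The function names are hypothetical.

```python
from math import comb, sqrt

def coin_table(n=10, kept=8):
    """Exact table for pairs of tossings of n coins when `kept` coins are
    left on the table and the other n - kept are re-tossed.
    table[x][y] = number of equally likely cases giving x heads at the
    first throw and y heads at the second."""
    m = n - kept
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for a in range(kept + 1):        # heads among the coins left
        for b in range(m + 1):       # heads among the other coins, first throw
            for c in range(m + 1):   # heads among the other coins, second throw
                table[a + b][a + c] += comb(kept, a) * comb(m, b) * comb(m, c)
    return table

def product_moments(table):
    """Return (N, S(xy), r) with deviations measured from the means."""
    size = len(table)
    cells = [(x, y, table[x][y]) for x in range(size) for y in range(size)]
    n_cases = sum(f for _, _, f in cells)
    mean_x = sum(x * f for x, _, f in cells) / n_cases
    mean_y = sum(y * f for _, y, f in cells) / n_cases
    s_xy = sum((x - mean_x) * (y - mean_y) * f for x, y, f in cells)
    s_xx = sum((x - mean_x) ** 2 * f for x, _, f in cells)
    s_yy = sum((y - mean_y) ** 2 * f for _, y, f in cells)
    return n_cases, s_xy, s_xy / sqrt(s_xx * s_yy)

for kept in (8, 5, 2):
    n_cases, s_xy, r = product_moments(coin_table(10, kept))
    print(kept, n_cases, s_xy, round(r, 4))
```

For eight coins left this gives N = 4096 and S(xy) = 8192, and the row for ten heads at the first throw contains the cases 1, 2, 1 described above.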

4. Let us see if we can use these tables to help us to decide
whether .8 or a function of .8 should be used as the measure or
coefficient of correlation. An easy experiment is to add the .8
table and the .2 table together after increasing the former so
that the two tables represent the distribution of the same total
number of cases. The result of such a process is that the means
of the rows (or columns) will be half-way between those shown
in the two tables from which the composite table is formed.
These means are identical with those of the .5 table, although
the distribution of the cases is different. This is evidence that
we can assume that .8 or .5 or .2 is a proper coefficient of
correlation and that we need not speculate with functions of
these figures. If we generalise our result we may say that we
want to find a function of r such that

\[ f(r_1) + f(r_2) = f(r_1 + r_2) \]

and this is satisfied by writing f(r) = r.
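
The statement about the means of the composite table can be verified exactly. The sketch below uses a hypothetical helper that rebuilds the coin tables by counting cases, scales the two tables to a common total and compares the means of the arrays as exact fractions.

```python
from fractions import Fraction
from math import comb

def coin_table(n, kept):
    """table[x][y] = cases with x heads at the first throw, y at the second."""
    m = n - kept
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for a in range(kept + 1):
        for b in range(m + 1):
            for c in range(m + 1):
                table[a + b][a + c] += comb(kept, a) * comb(m, b) * comb(m, c)
    return table

def array_means(table):
    """Mean number of heads at the second throw for each first-throw count."""
    return [Fraction(sum(y * f for y, f in enumerate(row)), sum(row))
            for row in table]

def add_equalised(t1, t2):
    """Add two tables after scaling each to the same total number of cases."""
    n1 = sum(map(sum, t1))
    n2 = sum(map(sum, t2))
    return [[f1 * n2 + f2 * n1 for f1, f2 in zip(r1, r2)]
            for r1, r2 in zip(t1, t2)]

combined = add_equalised(coin_table(10, 8), coin_table(10, 2))
print(array_means(combined) == array_means(coin_table(10, 5)))
print([float(m) for m in array_means(combined)])
```

The means rise by half a head for each additional head at the first throw, that is, with slope .5.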

5. It will, however, be noticed that we have chosen a
particular case where the distribution is based on a
symmetrical binomial, and it does not follow that other cases will
be so easy to interpret. We can, however, form similar tables
with dice where we regard two of the six faces as "head" and
the four other faces as "tail". We then get distributions of the
form $(\tfrac{1}{3} + \tfrac{2}{3})^{10}$; the means of the columns (or rows) are in a
straight line and we reach the means of the .5 table by adding
together equally large tables giving correlation of .8 and .2.
The r = .8 table is given on p. 168. Admittedly we have even
now only dealt with tables of double entry corresponding to
frequency distributions like the binomial series, and we cannot
expect all the distributions that occur in practice to be so
simple. We must not assume that in every case the means will
follow a straight line, nor are we entitled to say that the slope
of the straight line will give a correct measure of correlation
if the distribution diverges considerably from those discussed,
but the large majority of correlation tables conform
approximately to the type we have indicated.

6. The reader will have noticed that in the work we have
just been doing we have dealt with a series of points analogous
to the binomial series and not with a surface analogous to a
frequency-curve. The normal curve with which we dealt in a
previous chapter is, in certain conditions, the limit of the
binomial series, and the frequency surface* corresponding with
the normal curve is

\[ Z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}} \exp\left\{-\left[\frac{x^2}{2\sigma_1^2(1-r^2)} - \frac{2xyr}{2\sigma_1\sigma_2(1-r^2)} + \frac{y^2}{2\sigma_2^2(1-r^2)}\right]\right\} \]

Now, if r = 0 this expression reduces to the product of two
normal curves, and if we find $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} Z\,xy\,dx\,dy$, we reach
$N r \sigma_1 \sigma_2$, or $r = \dfrac{(xy)\text{ moment}}{\sigma_1\sigma_2}$, as we have already supposed in
Chapter VII.

* See Appendix III.

7. The normal surface has some properties to which special
attention may be called. If we examine the distribution of an
array of type t, we see that it is

\[ Z = Z_0\, e^{-(g_1 x^2 - 2htx + g_2 t^2)} \]

where $g_1$, $h$ and $g_2$ are written for the longer expressions in $\sigma_1$,
$\sigma_2$ and r.

Making the index a perfect square, we have

\[ Z = Z_0\, e^{-g_1\left(x - \frac{ht}{g_1}\right)^2} e^{-g_2 t^2 + \frac{h^2 t^2}{g_1}} = Z'\, e^{-g_1 X^2}, \qquad X = x - \frac{ht}{g_1}, \]

which is a normal distribution whose standard deviation is fixed
by the constants of the whole surface and does not depend on t,
but whose mean differs from that of the whole surface by $ht/g_1$.
It follows that

(1) the deviation of the mean of the array is directly
proportional to the type, or, in other words, the means
of arrays increase or decrease in arithmetical progression
and so lie on a straight line,

(2) the standard deviations of all parallel arrays are equal
and independent of their types.

So far as the former of these conclusions is concerned, we
have the same property as that found in our coin-tossing tables
and assumed in the previous chapter. The other property is
not found with our coin-tossing tables. It must not be
concluded from this that the normal surface has so small a scope
as to be of little practical use; it has, probably, a far larger
scope than the analogous normal curve has in frequency-curve
work. It may help the reader to visualise the surface if he bears
in mind the following points:

(1) vertical sections cut parallel to the axis of x or the axis
of y are normal curves,

(2) the contour lines are ellipses and if these ellipses are
projected on to the plane Z = 0 they are concentric,
similar and similarly situated.

The appearance would be that of an isolated hill standing on a
wide plain. This plain rises very slowly as we approach the hill,
then the hillside becomes gradually steeper until, as we near
the top, it becomes less steep and the top is nearly, but not
quite, flat. The hill is narrowest when seen from the north-west
or south-east and widest when seen from the north-east
or south-west.
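
For reference, the constants of § 7 can be written out by comparing the exponent of the array with the equation of the surface in § 6, taking the array at y = t. This identification is worked here as a sketch and is not quoted from the text:

```latex
g_1 = \frac{1}{2\sigma_1^2(1-r^2)}, \qquad
h   = \frac{r}{2\sigma_1\sigma_2(1-r^2)}, \qquad
g_2 = \frac{1}{2\sigma_2^2(1-r^2)},

\frac{ht}{g_1} = r\,\frac{\sigma_1}{\sigma_2}\,t, \qquad
\text{standard deviation of the array} = \frac{1}{\sqrt{2g_1}} = \sigma_1\sqrt{1-r^2}.
```

On this reading the mean of the array is proportional to t, with slope $r\sigma_1/\sigma_2$, and its spread does not involve t, which are the two properties listed in § 7.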

8. We may now discuss a danger against which we must be
on guard in statistical work on correlation. The danger is that
correlation may be revealed when it is absent, or exaggerated
when present, in consequence of the arrangement of the
statistical material. We will consider two causes of the
introduction of this "spurious correlation". The first may be taken
from our coin-tossing tables. We saw that by adding together
the .8 and .2 tables in equal proportions we reached a table
which gave a correlation of .5. But let us see what would
happen if we added together two tables where r = .5 but
shifted the mean of one of them. This might happen in practical
work if two persons, recording similar objects, measured
correctly except that one always overstated his results by a
constant figure. The results are then amalgamated and the
table formed might then be similar to the table on p. 169. The
coefficient of correlation is worked out and found to be .78.
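
The effect of such an amalgamation is easy to reproduce. In the sketch below the shift amounts are hypothetical, since the table on p. 169 is not reproduced here, so the printed coefficients are not expected to equal .78; the helper names are likewise illustrative. Each r = .5 table has variance 2.5 and product moment 1.25 per case, and a shift s applied to both measurements of one table adds s²/4 to both quantities in the combined table.

```python
from collections import Counter
from math import comb, sqrt

def coin_pairs(n=10, kept=5):
    """Frequencies of (heads at first throw, heads at second throw)."""
    m = n - kept
    freq = Counter()
    for a in range(kept + 1):
        for b in range(m + 1):
            for c in range(m + 1):
                freq[(a + b, a + c)] += comb(kept, a) * comb(m, b) * comb(m, c)
    return freq

def shift(freq, amount):
    """The same table recorded by an observer who overstates by `amount`."""
    return Counter({(x + amount, y + amount): f for (x, y), f in freq.items()})

def correlation(freq):
    n = sum(freq.values())
    mean_x = sum(x * f for (x, _), f in freq.items()) / n
    mean_y = sum(y * f for (_, y), f in freq.items()) / n
    s_xy = sum((x - mean_x) * (y - mean_y) * f for (x, y), f in freq.items())
    s_xx = sum((x - mean_x) ** 2 * f for (x, _), f in freq.items())
    s_yy = sum((y - mean_y) ** 2 * f for (_, y), f in freq.items())
    return s_xy / sqrt(s_xx * s_yy)

base = coin_pairs(10, 5)
for amount in (0, 2, 4):
    print(amount, round(correlation(base + shift(base, amount)), 4))
```

With no shift the coefficient stays at .5; any constant overstatement by one observer raises it, although neither observer's own records show more than .5.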

9. We may now consider how we might detect the cases in
which this sort of thing happens. The means of the various
rows run in a particular way; they begin and end as the r = .5
tables but, when the amalgamation comes in, the run of the
line is such as to join the two end pieces together with a curved
line. Again, the totals do not form a binomial or any single
frequency-curve. In the particular case these two points would
be sufficient warning, but in practice it is hard to apply them
because the ends of an experience, being based on relatively
small numbers, obscure the real shape of the regression lines
and the curve formed by the totals. There may also be many
observers instead of only two, and these observers might turn
the end pieces into curved lines and give a regression line like
a flattened S. The real remedy in such cases is to see that the
various experiences grouped together are alike as regards both
their means and their distributions and to use amalgamated
figures only when the amalgamation is justified.
10. Another way in which a spurious correlation may be
introduced arises through the use of indices. As an example we
may refer to endowment assurances by limited payments on
the books of a company doing a large quantity of such business
and consider the term of the original assurance ($t_1$), the number
of premiums to be paid in future ($t_2$), and the number of years
for which the policy has been in force ($t_3$). If we formed the
ratios $t_2/t_1$ and $t_3/t_1$, and worked out the coefficients of
correlation, we should not obtain a measure of the correlation between
number of premiums payable in future and the number of
years in force, because the result of using fractions with the
same denominator in each would be to exaggerate correlation,
that is, to introduce spurious correlation.

The general propositions of spurious correlation, of which
the result just mentioned is a particular case, are as follows:

I. To find the mean of an index in terms of the means, standard
deviations and coefficient of correlation of the two absolute
measurements.

Let $x_1, x_2, x_3, x_4$ be the absolute sizes of any four correlated
subjects; $m_1, m_2, m_3, m_4$ their mean values; $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ their
standard deviations; $r_{12}, r_{23}, r_{34}, r_{41}, r_{24}, r_{13}$ the six coefficients
of correlation; $e_1, e_2, e_3, e_4$ the deviations of the four subjects
from their means, i.e. $x_1 = m_1 + e_1$, etc.; $i_{13}$ the mean value of
the index $x_1/x_3$, and $i_{24}$ the mean value of $x_2/x_4$; $\Sigma_{13}$ and $\Sigma_{24}$ the
standard deviations of the indices $x_1/x_3$ and $x_2/x_4$ respectively;
and N the total number of groups.

We shall suppose the ratios of the deviations from the mean
values of the organs are so small that their cubes may be
neglected. Then

\[ i_{13} = \frac{1}{N} S\!\left(\frac{x_1}{x_3}\right) = \frac{1}{N}\,\frac{m_1}{m_3}\, S\!\left(1 + \frac{e_1}{m_1} - \frac{e_3}{m_3} - \frac{e_1 e_3}{m_1 m_3} + \frac{e_3^2}{m_3^2}\right) \]

But $S(e_1) = S(e_3) = 0$ and $S(e_1 e_3) = N\sigma_1\sigma_3 r_{13}$ and $S(e_3^2) = N\sigma_3^2$; hence

\[ i_{13} = \frac{m_1}{m_3}\left(1 + \frac{\sigma_3^2}{m_3^2} - r_{13}\frac{\sigma_1\sigma_3}{m_1 m_3}\right) \]

and

\[ i_{24} = \frac{m_2}{m_4}\left(1 + \frac{\sigma_4^2}{m_4^2} - r_{24}\frac{\sigma_2\sigma_4}{m_2 m_4}\right) \]

II. To find the standard deviation of an index in terms of the
standard deviations and coefficient of correlation of the two
absolute measurements.

\[ N \times \Sigma_{13}^2 = S\!\left(\frac{x_1}{x_3} - i_{13}\right)^2 = \frac{m_1^2}{m_3^2}\, S\!\left(\frac{e_1}{m_1} - \frac{e_3}{m_3} + \text{square terms}\right)^2 = N\, i_{13}^2 \left(\frac{\sigma_1^2}{m_1^2} + \frac{\sigma_3^2}{m_3^2} - \frac{2\sigma_1\sigma_3}{m_1 m_3}\, r_{13}\right) \]

or

\[ \Sigma_{13} = i_{13} \sqrt{\frac{\sigma_1^2}{m_1^2} + \frac{\sigma_3^2}{m_3^2} - \frac{2 r_{13}\sigma_1\sigma_3}{m_1 m_3}} \]
III. To find the coefficient of correlation of two indices in terms
of the coefficients of correlation of four absolute measurements and
their standard deviations.

Let $x_1/x_3$ and $x_2/x_4$ be the two indices.

Then, if $\rho$ be the coefficient of correlation of the two indices,
and writing $v_1$ for $\sigma_1/m_1$, etc.,

\[ \rho = \frac{r_{12}v_1v_2 - r_{14}v_1v_4 - r_{23}v_2v_3 + r_{34}v_3v_4}{\sqrt{v_1^2 + v_3^2 - 2r_{13}v_1v_3}\,\sqrt{v_2^2 + v_4^2 - 2r_{24}v_2v_4}} \]

Proposition I shows that the mean of an index is not the
ratio of the means of the corresponding absolute measurements,
and Proposition III shows that $\rho$ will vanish when the four
subjects forming the indices are quite uncorrelated, while, if
two, say, the third and fourth, are identical, so that $r_{34} = 1$ and
$\sigma_3/m_3 = \sigma_4/m_4$, we have

\[ \rho = \frac{r_{12}v_1v_2 - r_{13}v_1v_3 - r_{23}v_2v_3 + v_3^2}{\sqrt{v_1^2 + v_3^2 - 2r_{13}v_1v_3}\,\sqrt{v_2^2 + v_3^2 - 2r_{23}v_2v_3}} \]

This would become applicable in the case of endowment
assurances by limited payments to which we referred.

An interesting special case arises when the subjects $x_1, x_2, x_3$
are not correlated and $x_1/x_3$ and $x_2/x_3$ are formed; then

\[ \rho = \frac{v_3^2}{\sqrt{v_1^2 + v_3^2}\,\sqrt{v_2^2 + v_3^2}} \]

11. The practical lessons about spurious correlation to be
learnt from the foregoing are (1) to deal with homogeneous
data and not to be too certain about the value of a coefficient
in the case of amalgamated experiences until you are sure that
those experiences are homogeneous, (2) to avoid making, or be
careful in interpreting, correlation tables where the functions
correlated are expressed as indices in which the denominators
are identical or may themselves be correlated.
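
The size of the effect for indices with a common denominator can be gauged from the first-order approximation for the correlation of $x_1/x_3$ and $x_2/x_3$. The sketch below is an illustration: the function name and the notation v for the ratio σ/m are assumptions made here, and the formula holds only when the deviations are small compared with the means.

```python
from math import sqrt

def index_correlation(v1, v2, v3, r12=0.0, r13=0.0, r23=0.0):
    """First-order correlation between the indices x1/x3 and x2/x3.

    v1, v2, v3 are the ratios sigma/m of the three absolute measurements;
    r12, r13, r23 are their coefficients of correlation.
    """
    numerator = r12 * v1 * v2 - r13 * v1 * v3 - r23 * v2 * v3 + v3 * v3
    spread_1 = sqrt(v1 * v1 + v3 * v3 - 2 * r13 * v1 * v3)
    spread_2 = sqrt(v2 * v2 + v3 * v3 - 2 * r23 * v2 * v3)
    return numerator / (spread_1 * spread_2)

# Three uncorrelated measurements with equal relative variability.
print(round(index_correlation(0.1, 0.1, 0.1), 4))
# The same with a much steadier denominator.
print(round(index_correlation(0.1, 0.1, 0.02), 4))
```

Three quite uncorrelated measurements of equal relative variability thus yield indices correlated to the extent of .5, while a denominator that varies little introduces little spurious correlation.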

We may add that spurious correlation may arise when the
correlated pairs relate to successive years, and so are not taken
at random as regards time. If, however, the correlation
between the two nth differences becomes equal to the correlation
between the two (n + 1)th differences, we reach the correlation
independent of time, provided the dependence of each variable
on time takes the form $a + bt + ct^2 + \ldots$
Coin-tossings with ten coins in pairs. Eight coins common to each member of pair.
Two coins common to each member of pair.
CHAPTER IX


CORRELATION OF CHARACTERS NOT 
QUANTITATIVELY MEASURABLE 

1. Before the theory in this section is discussed we will give
a table showing the class of problem with which it deals,
drawn from vaccination statistics and relating to the Sheffield
smallpox outbreak of 1887-8:*

Degree of effective        Strength to resist smallpox when incurred
vaccination (cicatrix)       Recoveries       Deaths        Total

Present                         3,951           200         4,151
Absent                            278           274           552

Total                           4,229           474         4,703


The characters with which we are concerned are "Strength
to resist smallpox when incurred" and "Degree of effective
vaccination", and the statistics cannot be arranged in a more
detailed manner. The characters cannot be measured
quantitatively, but as the absence of such measurement does not
mean that there is no correlation, we must see how the coefficient
can be obtained in such a case.

2. Let us consider this problem in the first place by seeing if
we can write down a few cases in which we can assign a value
to the coefficient of correlation from general considerations. If
we toss a coin, it must come down "head" or "tail"; if we form
pairs as in Chapter VIII, by pairing consecutive tossings, there
will be no correlation, and in a table such as that of the previous
paragraph there would be an equal number in each division.
But if we made a pair by leaving the coin on the table, and
counting it a second time, we should have absolute correlation
and we should have in our table an equal number in the top
left-hand and bottom right-hand divisions, the other two
divisions being blank. If we amalgamate these two tables,
assuming that the total of each is 4, we reach the table shown
below, having a coefficient of correlation of .5.

* Biometrika, I, 375 et seq. This paper, by W. R. Macdonell, and a
supplementary one deal with the subject in a way that shows clearly the strength of
the evidence on the side of vaccination. The question of class is investigated,
a practical point frequently neglected.


Second            First tossing
tossing          Head      Tail      Total

Head               3         1         4
Tail               1         3         4

Total              4         4         8


In these simple cases, where the four divisions represent the
frequencies at four separate points, the correct value of r is
given by the expression

\[ \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}} \]

where a, b, c and d have the meanings indicated in the scheme
of § 5.

3. It does not, however, follow that this expression is one
which may be used in all circumstances as, though there are
only four divisions, the things measured may imply a
continuous scale of measurement even though we cannot or do not
express it in detailed fashion. Thus, in our vaccination statistics,
the degree of successful vaccination may vary between a
vaccination in infancy, for a person aged 40 at the time of the
epidemic, and a series of vaccinations, the last of which has
been recently performed. Again, the power of recovery when
attacked may also be deemed to lie on a longer scale than that
implied by the two divisions "recoveries" and "deaths". To
take another example we could, if we were studying
eye-colour in parent and offspring, make a scale of colour from
black down to pale blue (or to absence of pigment in albinotic
cases), but the statistics might be available merely in the form
"brown" and "not brown". In other words the statistics in
a four-fold correlation table may relate to a continuous
frequency distribution like the table on p. 142, but owing to
the way the facts had to be stated or collected there are only
four divisions for the whole of the material. Our first problem
is to see whether the simple formula at the end of § 2 will give
a satisfactory answer in these circumstances and, if it fails, what
alternative may be adopted.

4. In Chapter VIII we gave tables based on coin-tossing and
we might group the material of one of these tables into four
divisions and see what answer the formula in question gives.
If we take the table having a correlation of .5 and cut it between
the 5 "heads" and 6 "heads", we reach the following:

Number of heads          Number of heads in first tossing
in second tossing           0-5          6-10         Total

0-5                       15,330        5,086        20,416
6-10                       5,086        7,266        12,352

Total                     20,416       12,352        32,768

The formula gives

\[ \frac{15330 \times 7266 - 5086 \times 5086}{20416 \times 12352} = .34 \]

which is far removed from the true value of .5.
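
The simple expression of § 2 is quickly evaluated for both tables. The function name in the sketch below is hypothetical; a, b are taken as the first row of the fourfold table and c, d as the second.

```python
from math import sqrt

def fourfold_r(a, b, c, d):
    """(ad - bc) / sqrt{(a+b)(c+d)(a+c)(b+d)} for the fourfold table
         a  b
         c  d
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(fourfold_r(3, 1, 1, 3))                          # single-coin table of § 2
print(round(fourfold_r(15330, 5086, 5086, 7266), 4))   # grouped r = .5 table
```

The first table returns .5 as expected, while the grouped ten-coin table falls to about .34, which is the shortfall noted above.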

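5. It is clear from this evidence that we must look for
another solution, and having seen in the previous chapter that
we could express a frequency surface as

\[ Z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2} \exp\left[-\frac{1}{2}\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2} - \frac{2rxy}{\sigma_1\sigma_2}\right)\right] \]

we may now consider what conclusions we may draw if we
divide this surface into four parts by two planes at right angles
to the axes of x and y at distances h' and k' from the origin, as
suggested by the figures on p. 173.

Then

\[ d = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2} \int_{h'}^{\infty}\!\int_{k'}^{\infty} \exp\left[-\frac{1}{2}\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2} - \frac{2rxy}{\sigma_1\sigma_2}\right)\right] dx\,dy \]

\[ \phantom{d} = \frac{N}{2\pi\sqrt{1-r^2}} \int_{h}^{\infty}\!\int_{k}^{\infty} \exp\left[-\frac{1}{2}\,\frac{1}{1-r^2}\left(x^2 + y^2 - 2rxy\right)\right] dx\,dy \]

by substituting $x^2$ for $x^2/\sigma_1^2$ and $y^2$ for $y^2/\sigma_2^2$, and writing
$h = h'/\sigma_1$ and $k = k'/\sigma_2$.

Further

\[ b + d = \frac{N}{\sqrt{2\pi}\,\sigma_1} \int_{h'}^{\infty} e^{-x^2/2\sigma_1^2}\,dx = \frac{N}{\sqrt{2\pi}} \int_{h}^{\infty} e^{-\frac{1}{2}x^2}\,dx \]

and

\[ c + d = \frac{N}{\sqrt{2\pi}} \int_{k}^{\infty} e^{-\frac{1}{2}y^2}\,dy \]

and, remembering that N, the total frequency, = a + b + c + d,
we have

\[ \frac{(a + c) - (b + d)}{N} = \sqrt{\frac{2}{\pi}} \int_{0}^{h} e^{-\frac{1}{2}x^2}\,dx \]

and, similarly,

\[ \frac{(a + b) - (c + d)}{N} = \sqrt{\frac{2}{\pi}} \int_{0}^{k} e^{-\frac{1}{2}y^2}\,dy \]

As a, b, c, and d are known, h and k can be found from
Sheppard's Tables, and the problem becomes:

"To find a value for r from the equation

\[ \frac{d}{N} = \frac{1}{2\pi\sqrt{1-r^2}} \int_{h}^{\infty}\!\int_{k}^{\infty} \exp\left[-\frac{1}{2}\,\frac{1}{1-r^2}\left(x^2 + y^2 - 2rxy\right)\right] dx\,dy \]

where d, N, h, and k are known."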
The solution (see Appendix IV) leads to the following
equation:

\[ \frac{ad - bc}{N^2 HK} = r + \frac{r^2}{2}\,hk + \frac{r^3}{6}(h^2 - 1)(k^2 - 1) + \frac{r^4}{24}\,h(h^2 - 3)\,k(k^2 - 3) \]
\[ \qquad + \frac{r^5}{120}(h^4 - 6h^2 + 3)(k^4 - 6k^2 + 3) + \frac{r^6}{720}\,h(h^4 - 10h^2 + 15)\,k(k^4 - 10k^2 + 15) \]
\[ \qquad + \frac{r^7}{5040}(h^6 - 15h^4 + 45h^2 - 15)(k^6 - 15k^4 + 45k^2 - 15) + \text{etc.} \]

where

\[ H = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}h^2} \quad\text{and}\quad K = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}k^2} \]

The numerical solution has to be obtained by approximating
to the roots, and Newton's method* is convenient for the
purpose.

6. The numerical work of our first example is as follows:

\[ \sqrt{\frac{2}{\pi}} \int_{0}^{h} e^{-\frac{1}{2}x^2}\,dx = \frac{(a + c) - (b + d)}{N} = \frac{3755}{4703} = .7984265 \]

\[ h = 1.27716 \]

by interpolation in Sheppard's Tables (see Tables for
Statisticians). In using these tables for this purpose, remember that
the value .7984265 corresponds to α in his notation, so

½(1 + .7984265) = .8992132

must be looked up inversely in his Table II. If his Table III be
used, it must be entered with .7984265.

* Newton's method of approximating to the root of an equation. Let f(x) = 0
be an equation from which the value of x is to be found and let b be a value
near to x so that x = b + h where h is small; then f(x) = f(b + h) = f(b) + hf'(b) +
terms involving higher powers of h by Taylor's Theorem, and since f(x) = 0, we
have h = -f(b)/f'(b) or x = b - f(b)/f'(b). The chief objection to the method is that there
may be more than one root near the value b, but this does not hold in the
application to correlation. (Cf. Approximations to rate of interest from an annuity,
Todhunter's Interest and Annuities Certain, p. 177, formula 2.)
Similarly

\[ \sqrt{\frac{2}{\pi}} \int_{0}^{k} e^{-\frac{1}{2}y^2}\,dy = .7652561 \]

\[ k = 1.18833 \]

We next require $\dfrac{ad - bc}{N^2 HK}$, and we first get from Sheppard's
Tables

H = .1764870   ∴ log H = $\bar{1}$.2467127

K = .1969111   ∴ log K = $\bar{1}$.2942702

Hence log $\dfrac{ad - bc}{N^2 HK}$ = .1258266

and $\dfrac{ad - bc}{N^2 HK}$ = 1.336062

Dr Macdonell gives 56 instead of 62 as the last two figures;
the difference is probably due to interpolation.

Turning to the expression for r, we notice that hk is a product
in the coefficients of $r^2$, $r^4$, $r^6$, etc., so it is well to work out its
value and keep a note of it while the coefficients are being
found. It is also advisable to begin the work by writing down
the first six or seven powers of h and k.

Macdonell gives the following series:

\[ .097083r^7 + .008170r^6 + .119614r^5 + .137450r^4 + .043352r^3 + .758844r^2 + r = 1.336056 \]

In order to obtain r we must find a value near the true one
as a first approximation.

Taking

\[ .758844r^2 + r - 1.336056 = 0 \]

we have

\[ r = \frac{-1 + \sqrt{1 + 4 \times 1.336 \times .7588}}{1.5177} = .8 \]

Now, this value will be in excess of the truth because we have
used only two terms of the series on the left-hand side of the
equation for finding r, and we may take .77 as a trial rate.
Applying Newton's Rule, we have:

\[ r = .77 - \frac{-1.336056 + (.77) + .7588(.77)^2 + .0434(.77)^3 + .1375(.77)^4 + .1196(.77)^5 + .0082(.77)^6 + .0971(.77)^7}{1 + 2(.77)(.7588) + 3(.77)^2(.0434) + 4(.77)^3(.1375) + 5(.77)^4(.1196) + 6(.77)^5(.0082) + 7(.77)^6(.0971)} \]

\[ = .77 - \frac{.0022}{2.861} = .7692 \]

In work such as this a table giving the first seven powers of
the natural numbers is a help.
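
The whole calculation of § 6 can be sketched in a few lines. This is an illustration, not the tabular procedure of the text: `statistics.NormalDist` from the Python standard library stands in for the inverse look-up in Sheppard's Tables, the series is stopped at the seventh power as above, and the function names and the starting value .5 are assumptions made here.

```python
from math import factorial
from statistics import NormalDist

def hermite(n, x):
    """Polynomials 1, x, x^2 - 1, x^3 - 3x, ... built by recurrence."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for j in range(1, n):
        prev, cur = cur, x * cur - j * prev
    return cur

def tetrachoric_r(a, b, c, d, terms=7, start=0.5, tol=1e-10):
    """Solve the series for r by Newton's method; returns (h, k, coef, r)."""
    n = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + c) / n)   # division of the columns
    k = std.inv_cdf((a + b) / n)   # division of the rows
    target = (a * d - b * c) / (n * n * std.pdf(h) * std.pdf(k))
    # coef[j - 1] multiplies r**j in the series.
    coef = [hermite(j - 1, h) * hermite(j - 1, k) / factorial(j)
            for j in range(1, terms + 1)]
    r = start
    for _ in range(50):
        f = sum(cj * r ** j for j, cj in enumerate(coef, start=1)) - target
        df = sum(j * cj * r ** (j - 1) for j, cj in enumerate(coef, start=1))
        step = f / df
        r -= step
        if abs(step) < tol:
            break
    return h, k, coef, r

h, k, coef, r = tetrachoric_r(3951, 200, 278, 274)
print(round(h, 4), round(k, 4), round(coef[1], 4), round(r, 4))
```

With the vaccination figures this gives h and k close to the tabular 1.27716 and 1.18833 and a root near .769. Small differences in the later decimals are to be expected, since the tabular values were obtained by interpolation.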

7. Tables of various functions required for the arithmetical
work will be found in Tables for Statisticians. The term
"tetrachoric functions"* is employed there. These tables are
arranged so that we can use the equation

\[ d/N = \tau_0\tau_0' + \tau_1\tau_1'\,r + \tau_2\tau_2'\,r^2, \text{ etc.} \]

where the values of $\tau$ are tabulated up to $\tau_{19}$, and further values
can be obtained by a difference formula given in the
introduction to the tables.

The calculation of the coefficient has been set out above in
detail, but with the help of Tables VIII and IX of Tables for
Statisticians, Part II, much of the work can be avoided. All
that has to be done, if these tables are available, is (1) to
calculate h and k as shown in § 6, (2) to calculate the ratio that
the number in quadrant d bears to the total number of cases,
i.e. d/N, and (3) to interpolate in the tables so as to obtain r.

* The tetrachoric functions are closely allied to the Hermite polynomials and
provide the fullest tables available. The sth tetrachoric function is (cp. p. 130)

\[ \tau_s(h) = \frac{(-1)^{s-1}}{\sqrt{s!}}\,\frac{d^{s-1}}{dh^{s-1}}\left(\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}h^2}\right) \]

and the (s - 1)th Hermite polynomial is

\[ H_{s-1}(h) = (-1)^{s-1}\,e^{\frac{1}{2}h^2}\,\frac{d^{s-1}}{dh^{s-1}}\left(e^{-\frac{1}{2}h^2}\right) = \sqrt{s!}\,\sqrt{2\pi}\,e^{\frac{1}{2}h^2}\,\tau_s(h) \]

8. We may now return to our coin-tossing, and we find that
if we work out the coefficient of correlation for the table in § 4
by the method just discussed, we reach a value of between .51
and .52. This is a good result, especially when we remember that
the coin-tossing is not an absolutely continuous scale.

The broad conclusion that may be reached is that the
assumptions lead to reasonable results in the kind of cases we have
tested. The method does not work so well when the frequency
surface is cut far from the mean, and the numerical results in
such cases should not be assumed to have minute accuracy.
r9. We have assumed that the data available are only divided 
into four divisions and we shall postpone till later (Chapter 
XII) the discussion of correlation when the characteristics are 
not quantitatively measurable but are divided into several 
categories We may, however, now deal with the case m which 
one variate is and the other is not quantitatively measurable 
as, for instance, m the table on p 179 relating to the effect of 
enlarged glands on the weight of children (boys) * Though ther 
statistics are divided into good and bad glands, the condition 
of glands is a contmuous variate some of the boys with bad 
glands were worse than others 

If the reader considers a volume of frequency built out of a
complete table such as that for endowment assurances, or out
of a correlation table giving relative ages of husbands and
wives, he will see that he has a complete distribution. Now, if
a volume of frequency be cut off from such a complete volume
by a vertical plane at a given value of one variate, then the
vertical through the centroid of this volume cuts the regression
line. The vertical plane in the two-row table is at the division
of the rows, in our example where the good glands end and the
bad glands begin. If p and q be the co-ordinates of the point of
section where the vertical through the centroid of the volume
cuts the regression line, then we have, $\sigma_1$ and $\sigma_2$ being the
standard deviations of the two variates and r the correlation,

$$p = r\,\frac{\sigma_1}{\sigma_2}\,q \quad\text{or}\quad r = \frac{p}{\sigma_1} \Big/ \frac{q}{\sigma_2}.$$

* The method is given by K. Pearson in Biometrika, VII, 96 et seq., and the
example is taken from that paper.

Weight     Boys with      Boys with      Total
           good glands    bad glands

  14            2              -            2
  16            3              5            8
  18           15             26           41
  20           20             40           60
  22           28             47           75
  24           34             30           64
  26           30             31           61
  28           29             20           49
  30           30             30           60
  32           21             14           35
  34           18             11           29
  36           18              5           23
  38            6              7           13
  40            5              2            7
  42            7              3           10
  44            1              -            1
  46            -              -            -
  48            3              -            3
  50            1              -            1
  52            1              -            1
  62            2              -            2

Total         274            271          545

Now p is the mean value of the quantitatively measurable
variate, measured from the mean of all the cases, for all the
pairs with a certain one of the alternative variates (in our
example, the mean weight of boys with bad glands measured from
the mean weight of all the boys), and $\sigma_1$ is the standard
deviation of all the boys. We cannot calculate q and $\sigma_2$ in a
similar way, because they relate to glands, of which no
quantitative measure is available. If we assume the
non-measurable variate (glands) to follow the normal probability
distribution, the proportion of the non-measurable variate gives,
with the help of tables of the probability integral, the ratio
$y/\sigma_2$ for the distance from the mean at which the division of
this variate occurs, and then

$$\frac{q}{\sigma_2}
  = \frac{\displaystyle\int_{y/\sigma_2}^{\infty} t\,e^{-\frac{1}{2}t^2}\,dt}
         {\displaystyle\int_{y/\sigma_2}^{\infty} e^{-\frac{1}{2}t^2}\,dt}
  = \frac{\dfrac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(y/\sigma_2)^2}}
         {\dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{y/\sigma_2}^{\infty} e^{-\frac{1}{2}t^2}\,dt}.$$


The numerator is z and the denominator $\tfrac{1}{2}(1 - \alpha)$ in the
notation used in Sheppard’s tables of the probability integral.

The working of the numerical example may help to make the
method clearer; it is as follows:

The mean weight of all the boys is ............ 27.7522
The standard deviation is ..................... 6.7502
The mean weight of boys with bad glands is .... 27.3737

$\tfrac{1}{2}(1 - \alpha)$ = 271/545 = .4972,
$\tfrac{1}{2}(1 + \alpha)$ = .5028, and this value corresponds with
z = .3989 in Sheppard’s tables.

The correlation of glands and weight is

$$\frac{27.7522 - 27.3737}{6.7502} \Big/ \frac{.3989}{.4972} = .070.$$

It may be remarked that the use of $\tfrac{1}{2}(1 - \alpha)$ assumes that
the column with the smaller total frequency will be taken; thus,
in our example, there are fewer boys with bad glands than with
good glands.
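
The arithmetic of this example may be checked with a short sketch in Python. It takes the three summary figures quoted above as given and, as an assumption of the sketch only, uses the normal distribution of the Python standard library in place of Sheppard’s tables; the variable names are ours.

```python
from statistics import NormalDist

# Summary figures quoted in the worked example.
mean_all = 27.7522      # mean weight of all the boys
sd_all = 6.7502         # standard deviation of weight, all boys
mean_bad = 27.3737      # mean weight of boys with bad glands
tail = 271 / 545        # proportion in the smaller column, (1 - alpha)/2

std_normal = NormalDist()            # mean 0, standard deviation 1
cut = std_normal.inv_cdf(1 - tail)   # point of division, in units of sigma_2
z = std_normal.pdf(cut)              # ordinate of the curve at that point

# r = (p / sigma_1) / (q / sigma_2), with q / sigma_2 = z / tail
r = ((mean_all - mean_bad) / sd_all) / (z / tail)
print(f"z = {z:.4f}, r = {r:.3f}")
```

The ordinate and the coefficient agree with the tabular working to the figures quoted.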

10. This example suggests a practical point, namely, that,
before actually working out a coefficient of correlation, it is
advisable to look at the statistics and form a preliminary idea
of whether there is any correlation. In tables such as that on
p. 142 there is no correlation if all the means of the rows are
alike and all the means of the columns are alike. Similarly,
in tables such as the one on p. 170, if the entries within the
table are proportional to the totals there is no correlation. In
the example in § 9 above a comparison of the two inner columns
with the total column shows that if there is any correlation it
must be small, because the distribution in the total would give
a possible “graduation” of each of the inner columns.


CHAPTER X

STANDARD ERRORS

1. In statistical work we calculate a mean from a number of
measurements, and we may be tempted to think that our work
has definitely established the mean with which we are concerned.
The arithmetical work may be correct in every detail
and the measurements may have been made accurately, but
the mean found from the statistics may differ from the true
mean of the character measured because the things measured
are limited in number; because, in other words, the sample we
have taken does not exactly represent the unlimited population
from which it is drawn. If we toss five coins and record
the number of heads, we should obtain a table like the following,
where we give results of 140 repetitions of the experiment.*

Number of “heads”     Number of trials in which that
in trial              number of heads was recorded

       5                          4
       4                         24
       3                         49
       2                         40
       1                         20
       0                          3

     Total                      140


Now, treating this as a mere statistical problem, in which we
do not know a priori anything about the true distribution, we
may work out the mean number of “heads” as 2.6. This does
not prove that this mean and no other can arise; a second
experiment might give a different result and we cannot,
therefore, say from our calculation what the mean value really is.
We can, however, approach the problem in another way: we
can try to decide how deviations from a true mean are likely to
be distributed and so form an opinion as to how a mean
calculated from an experiment or series of experiments will differ
from the truth. For practical purposes we might rest content if
we could say that the true mean will not differ from the
calculated mean by more than a small quantity (e) once in a hundred
trials. Before we go into the measures actually used, let us
consider some of the points in a preliminary way.

All our statistical experience makes us feel sure that an
experiment based on 1,000 cases must be more reliable than a
similar experiment based on 50 cases, so we anticipate that e,
the small difference between the true and the calculated mean,
will depend in some way on the number of cases. Again, a
distribution that spreads widely gives the mean more
opportunity to deviate than a distribution that is concentrated;
so that we may also anticipate that e will depend on the spread
of the distribution, that is, on its standard deviation.

* Such an experiment may, alternatively, be regarded as drawing a random
sample of 140 cases from an infinite population distributed as $(\tfrac{1}{2} + \tfrac{1}{2})^5$.
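
As a check on the mean quoted for the coin-tossing table, the calculation may be sketched in Python. The dictionary simply restates the table of § 1; its name is ours.

```python
# Table of § 1: number of heads in a trial -> number of trials recorded.
trials = {5: 4, 4: 24, 3: 49, 2: 40, 1: 20, 0: 3}

total = sum(trials.values())
mean_heads = sum(heads * count for heads, count in trials.items()) / total
print(total, round(mean_heads, 2))
```

The total of 140 trials is recovered and the mean comes to rather less than 2.6, against the value 2.5 to be expected from unbiased coins.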

2. These remarks apply to all statistical measures. The
measures are inexact and only approximate to the truth, but
we can say that it is highly probable that they do not differ by
more than a certain amount from the result which would be
obtained if we could deal with an unlimited number of facts.
In our discussion we have spoken of means, but every other
measure is subject to the same general considerations and we
must, therefore, consider what sort of value may be assigned
to the small error e for means, standard deviations, coefficients
of correlation and other measures. We have anticipated that
this error will depend on a standard deviation, and this
sometimes leads to a little confusion because we must make up our
minds as to the distribution to which the standard deviation
refers. Let us imagine that we have worked out a coefficient
of correlation for 100 pairs of, say, ages at marriage of husband
and wife. Then we work out a second coefficient of correlation
for another hundred pairs and go on till we have a large number
of these results. The coefficients will fall into a distribution like
one of the frequency distributions we discussed in earlier
chapters, and it is the standard deviation of that distribution
from its mean with which we are concerned. Similarly, we can
repeat the coin-tossing experiment of § 1 over and over again
and calculate the mean from each experiment. We may obtain
any value for the mean between 5 and 0 heads; these extremes
will only arise in the most unlikely case when every trial gave
5 heads or every trial gave no head. The most likely mean is
2.5, and if we repeat the experiment sufficiently we shall form
an idea of the way the means are distributed: we shall reach a
frequency distribution of means having its own mean, standard
deviation, etc., and we must not confuse it with the frequency
distributions, such as that in the table in § 1, from which the
means were calculated. It will help to avoid confusion of ideas
if we speak of “standard error” when we are referring to the
frequency distribution of a statistical measure (such as a mean,
coefficient of correlation, etc.) instead of speaking of “standard
deviation”. The standard error is, then, the standard deviation
of the frequency distribution of the particular measure we are
examining.
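
The repetition of the coin-tossing experiment described above can be imitated by a small simulation. What follows is only a sketch under stated assumptions: unbiased coins, 2,000 repetitions of the 140-trial experiment, an arbitrary fixed seed, and Python's pseudo-random generator standing in for actual tossing; none of these figures comes from the text.

```python
import random
from statistics import mean, pstdev

def mean_heads(rng, trials=140, coins=5):
    """Mean number of heads per trial in one experiment of `trials` trials."""
    # Each trial: `coins` random bits, the 1-bits being counted as heads.
    heads = sum(bin(rng.getrandbits(coins)).count("1") for _ in range(trials))
    return heads / trials

rng = random.Random(1903)              # fixed seed, chosen arbitrarily
means = [mean_heads(rng) for _ in range(2000)]

centre = mean(means)                   # mean of the distribution of means
standard_error = pstdev(means)         # its standard deviation
print(round(centre, 3), round(standard_error, 3))
```

The means cluster about 2.5, and their spread is far narrower than the spread of the number of heads in a single trial, which is the distinction the term “standard error” is meant to preserve.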

3. With this introduction we may now consider the simplest
of the bell-shaped frequency curves, namely, the normal curve
of error, and see what conclusions we may draw if the
distribution of a statistical measure takes that form. It has, in
fact, been shown to be the form that the distribution of
statistical measures tends to assume when the number of cases
in the sample is large. Thus, even if the distribution in § 1 had
been skew instead of symmetrical, the distribution of the
means would have been more nearly of the form of the normal
curve than the skew distributions from which they were
obtained. By reference to the tables of this curve we see that
the area lying within the standard deviation on either side of
the mean is about two-thirds of the whole area, while the area
lying within twice the standard deviation is .9545 of the whole
area. In other words, if the distribution takes this form we can
say that an error of more than twice the standard error will occur
about 9 times in 200 trials and is, therefore, unlikely to have
arisen in the particular case with which we are dealing. The
diagram will help the reader to follow this argument. The area
between AB, which is at a distance equal to the standard
deviation from the mean on one side, and A'B', at the same
distance on the other side of the mean, is approximately
two-thirds of the whole area. The lines CD and C'D', which are
twice as far from the mean, include nearly the whole curve; the
pieces beyond those lines are tails which must be of relatively
small dimensions.
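
The areas quoted in this section may be verified without tables. The sketch below assumes only the normal curve itself, taken from the Python standard library; the helper name is ours.

```python
from statistics import NormalDist

std_normal = NormalDist()   # normal curve with unit standard deviation

def area_within(k):
    """Fraction of the whole area lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_one = area_within(1)                   # about two-thirds
within_two = area_within(2)                   # the .9545 of the text
beyond_two_per_200 = 200 * (1 - within_two)   # cases in 200 trials beyond twice the s.e.
print(round(within_one, 4), round(within_two, 4), round(beyond_two_per_200, 1))
```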

4. It was formerly the custom to use another function
known as the probable error, which is .67449 times the standard
error. The probable error gives that value of x (say p) which
divides the part of the normal curve representing positive
errors into two equal portions; it is therefore given by

$$\frac{1}{\sigma\sqrt{2\pi}}\int_0^{p} e^{-x^2/2\sigma^2}\,dx = \tfrac{1}{4}$$

where the whole area of the curve (positive plus negative
deviations) is unity. In order to find p in terms of the standard
deviation, we have, therefore, to obtain the value of x
corresponding to $\tfrac{1}{2}(1 + \alpha)$ = .75 in Tables for Statisticians, Part I,
Table ii, or the short table in Appendix IX, where $\alpha$ is

$$\alpha = \frac{2}{\sqrt{2\pi}}\int_0^{x} e^{-\frac{1}{2}t^2}\,dt.$$

This can be done by interpolating inversely, and p is thus found
to be .67449 times the standard deviation approximately. The
mean, or rather the vertical through the mean, divides the whole
distribution into two equal parts; the probable error divides it
into fourths and gives what Galton called the quartiles. The
position is shown with the letters P and P' in the diagram;
thrice the probable error includes about the same area as twice
the standard error.
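
The inverse interpolation may be imitated by asking for the point of the normal curve below which three-quarters of the area lies. The following is a sketch only, with the standard library's inverse of the probability integral standing in for the printed tables.

```python
from statistics import NormalDist

std_normal = NormalDist()
probable_error = std_normal.inv_cdf(0.75)   # in units of the standard error

# Area enclosed by thrice the probable error, compared with the
# area enclosed by twice the standard error.
k = 3 * probable_error
area_three_pe = std_normal.cdf(k) - std_normal.cdf(-k)
area_two_se = std_normal.cdf(2) - std_normal.cdf(-2)
print(round(probable_error, 5), round(area_three_pe, 4), round(area_two_se, 4))
```

The two areas differ only in the third decimal place, which is why the two rules of thumb are nearly interchangeable.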

We may set down the following general rules:

(1) the true value and a calculated value of a mean or other
    characteristic are unlikely to differ by more than twice
    the standard error;

(2) if an experiment on any subject leads to a result which
    differs from that expected by more than twice the
    standard error we must suspect that we are not dealing
    with a random sample.

5. The problem before us is to consider how statistical
measures calculated from limited data may vary about the
expected values. Two methods of approach are available. We
may, as indicated in § 2, make a large number of experiments
(or collect a large number of samples) and calculate the
statistical measure in question for each of them. The procedure
would generally be much too laborious, and we take, therefore,
the second line of approach. Algebraic analysis based on the
theory of probability enables us to determine the standard
error that we should find in the limit if the sampling process
were repeated indefinitely, so that all possible samples were
included in their expected proportions. We can often go further
and determine the actual curve to which the frequency
distribution of a particular statistical measure will tend as the
number of samples is increased.

We may now take a simple illustration and find the standard
error of the frequency, say n, with which an event will happen
in m independent trials, where p is the probability of its
happening and q of its failing. The probabilities of n being equal
to m, m - 1, ..., 2, 1, 0 are given by the successive terms of the
binomial expansion of $(p + q)^m$. Taking moments about the
point represented by $p^m$, the first moment is

$$mp^{m-1}q + m(m-1)p^{m-2}q^2 + \dots + mq^m = mq(p+q)^{m-1} = mq.$$

The second moment about the same point is

$$\begin{aligned}
& mp^{m-1}q + 2m(m-1)p^{m-2}q^2 + \tfrac{3}{2}m(m-1)(m-2)p^{m-3}q^3 + \dots + m^2q^m \\
&= mp^{m-1}q + m(m-1)p^{m-2}q^2 + \tfrac{1}{2}m(m-1)(m-2)p^{m-3}q^3 + \dots + mq^m \\
&\quad + m(m-1)p^{m-2}q^2 + m(m-1)(m-2)p^{m-3}q^3 + \dots + m(m-1)q^m \\
&= mq + m(m-1)q^2.
\end{aligned}$$

The second moment about the mean is, therefore,

$$mq + m(m-1)q^2 - m^2q^2 = mpq$$

and the standard error is $\sqrt{mpq}$.

That is to say, if we repeatedly make m independent trials
the observed frequency of occurrence, n, will vary about the
expected value mp with a standard error of $\sqrt{mpq}$.
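
The result may be confirmed numerically by summing the terms of the binomial directly. In the sketch below the values m = 20 and p = .3 are illustrative only and are not taken from the text; the function name is ours.

```python
from math import comb, sqrt

def binomial_standard_error(m, p):
    """Standard deviation of the number of happenings, by direct summation."""
    q = 1 - p
    probs = [comb(m, n) * p**n * q**(m - n) for n in range(m + 1)]
    first = sum(n * pr for n, pr in enumerate(probs))        # first moment about 0
    second = sum(n * n * pr for n, pr in enumerate(probs))   # second moment about 0
    return sqrt(second - first**2)

m, p = 20, 0.3                      # illustrative values only
direct = binomial_standard_error(m, p)
formula = sqrt(m * p * (1 - p))     # the square root of mpq
print(round(direct, 6), round(formula, 6))
```

The sketch takes moments about the point n = 0 rather than about the term $p^m$; the second moment about the mean is the same from whichever point the moments are first measured.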

6. We may now apply this result to a few examples.

(a) It has been remarked that the number of male children
born is to the number of female children born as 1,050 : 1,000;
in other words, the probability of a child being male is
1,050/2,050. If 51,350 out of 100,000 children proved to be
males in a certain community, would it be safe to base on
the statistics any theory connected with the variation from
the usual probability? The expected result is 51,220, and
the standard error is

$$\sqrt{100{,}000 \times \frac{1050}{2050} \times \frac{1000}{2050}} = \pm 158.07.$$

The difference between the actual case and the expected result
was 130, and as this is less than the standard error, no definite
conclusion can be based on the divergence from the result.

(b) If the number of cases had been 10,000,000, and the
actual number 5,135,000, then, the standard error being
1,580.7 and the actual difference 13,000, it would have been
sufficient evidence for the conclusion that the ratio 1,050 : 1,000
did not fit the particular case.

(c) If the probability of death within a year is .007, the
probable error in 200 cases is $.67449\sqrt{200 \times .007 \times .993} = .80$,
and it would, therefore, be possible to approximate to a loading
for emergencies if 2.2 was taken instead of 1.4 as the number of
deaths expected in a year out of 200 cases on risk for a year.
The probable error would, I think, be preferable to the standard
error for this purpose. That is, it would not be unreasonable to
treat .011 as the rate of mortality instead of .007 in order to
obtain some idea of an emergency loading for term assurances,
on the assumption that the number of cases is about 200 and
the average age is such that .007 might be taken as the
probability of death in a year. It has also been assumed that
it is correct to treat each class as if it were subject to its
own rate of mortality and had to be treated independently
of the rest of the business; that is, however, a debatable
point.

(d) It will be noticed that if m remains constant, then
$\sqrt{mpq}$ has its largest numerical value when $p = q = \tfrac{1}{2}$, which
shows that an insurance office will generally find that if it has
two classes of equal size, and one is subject to a higher rate of
mortality than the other, the former will have the larger actual
deviations from the expected number of claims, because the
probability of dying in a year only reaches the value $\tfrac{1}{2}$ at the
end of the mortality table.
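
The figures of examples (a) to (c) may be reproduced by the following sketch, which applies the standard error $\sqrt{mpq}$ to the numbers stated in the examples; the function and variable names are ours.

```python
from math import sqrt

def standard_error(m, p):
    """Standard error of the frequency in m trials: square root of mpq."""
    return sqrt(m * p * (1 - p))

p_male = 1050 / 2050

# (a) 100,000 births, of which 51,350 were male.
se_a = standard_error(100_000, p_male)
diff_a = 51_350 - 100_000 * p_male

# (b) 10,000,000 births, of which 5,135,000 were male.
se_b = standard_error(10_000_000, p_male)
diff_b = 5_135_000 - 10_000_000 * p_male

# (c) probable error of the deaths among 200 lives at a rate of .007.
pe_c = 0.67449 * standard_error(200, 0.007)

print(round(se_a, 2), round(diff_a), round(se_b, 1), round(diff_b), round(pe_c, 2))
```

In (a) the divergence is smaller than one standard error; in (b), with a hundred times the cases, the same proportional divergence is more than eight times the standard error.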

7. We may now consider a frequency distribution divided
into k groups such that the proportion of cases in the sth group
is $p_s$ and, clearly, $p_1 + p_2 + \dots + p_k = 1$. If we take a
case at random from this distribution, the chance that it comes
from the sth group is $p_s$ and the chance that it comes from some
other group is $q_s = 1 - p_s$. Let us suppose that m cases are
taken at random and that $n_s$ of them fall in the sth group. Then,
though the expected value of $n_s$ is $mp_s$, this frequency may
assume values m, m - 1, ..., 2, 1, 0 with probabilities given by
the terms of the binomial $(p_s + q_s)^m$, and the standard error of
$n_s$ will be

$$\sigma_{n_s} = \sqrt{mp_s(1 - p_s)} \qquad (1)$$

If, in practice, we do not know the exact form of the frequency
distribution from which the sample has been taken, we may
approximate to the standard error by putting $p_s = n_s/m$, the
observed proportion in the sth group of the sample. Hence, we
have, approximately,

$$\sigma_{n_s} = \sqrt{n_s(1 - n_s/m)} \qquad (2)$$

8. As the total of all the frequencies $n_1, n_2, \dots, n_k$ is m, it
follows that, if in a particular sample $n_s$ is much greater than
$mp_s$, the other frequencies must on the average be too small,
and this shows that the errors between the groups are
correlated. The next point to be investigated is the amount of the
correlation between deviations in the frequencies of the sth and
tth groups.

The deviation, $\delta n_s$, of $n_s$ from its expected value is $n_s - mp_s$.
As we are considering the relation between deviations in $n_s$
and $n_t$ we may conveniently class together all the remaining
k - 2 frequency groups into a single remainder group, say, $n_R$.
Then

$$n_s + n_t + n_R = m$$
$$p_s + p_t + p_R = 1$$
$$\delta n_s + \delta n_t + \delta n_R = 0$$

and

$$(\delta n_s + \delta n_t)^2 = (-\delta n_R)^2$$

or

$$\delta n_s\,\delta n_t = \tfrac{1}{2}(\delta n_R^2 - \delta n_s^2 - \delta n_t^2).$$

If we now imagine that a very large number, N, of random
samples is taken and the expressions on both sides of the
last equation are summed and divided by their number, N,
then

$$\frac{1}{N}S(\delta n_s\,\delta n_t) = \tfrac{1}{2}\left\{\frac{1}{N}S(\delta n_R^2) - \frac{1}{N}S(\delta n_s^2) - \frac{1}{N}S(\delta n_t^2)\right\}.$$

The expressions on the right-hand side represent, in the limit,
the squared standard errors of the group frequencies, given in
equation (1) above. Hence in the limit

$$\begin{aligned}
\frac{1}{N}S(\delta n_s\,\delta n_t)
&= \tfrac{1}{2}m\{p_R(1-p_R) - p_s(1-p_s) - p_t(1-p_t)\} \\
&= \tfrac{1}{2}m\{(1-p_s-p_t)(p_s+p_t) - p_s(1-p_s) - p_t(1-p_t)\} \\
&= -mp_sp_t \qquad (3)
\end{aligned}$$

But the correlation between $n_s$ and $n_t$ is

$$\frac{\text{Limit of } \frac{1}{N}S(\delta n_s\,\delta n_t)}{\sigma_{n_s}\sigma_{n_t}}
= \frac{-mp_sp_t}{m\sqrt{p_s(1-p_s)\,p_t(1-p_t)}}
= -\sqrt{\frac{p_sp_t}{(1-p_s)(1-p_t)}} \qquad (4)$$

We may again approximate to this expression by substituting
for $p_s$ and $p_t$ the proportionate frequencies, $n_s/m$ and $n_t/m$,
of the sample.
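
Results (1), (3) and (4) may be checked exactly for a small case by enumerating every possible sample. The sketch below assumes, purely for illustration, m = 6 cases and proportions .2, .3 and .5 for the sth, tth and remainder groups; these values are not from the text.

```python
from math import factorial, sqrt

m = 6
p_s, p_t = 0.2, 0.3          # illustrative proportions only
p_r = 1 - p_s - p_t          # remainder group

def prob(n_s, n_t):
    """Chance of n_s, n_t and m - n_s - n_t cases in the three groups."""
    n_r = m - n_s - n_t
    ways = factorial(m) // (factorial(n_s) * factorial(n_t) * factorial(n_r))
    return ways * p_s**n_s * p_t**n_t * p_r**n_r

# Every possible sample, with its chance of arising.
cells = [(a, b, prob(a, b)) for a in range(m + 1) for b in range(m + 1 - a)]

cov = sum((a - m * p_s) * (b - m * p_t) * pr for a, b, pr in cells)
var_s = sum((a - m * p_s) ** 2 * pr for a, b, pr in cells)
var_t = sum((b - m * p_t) ** 2 * pr for a, b, pr in cells)
corr = cov / sqrt(var_s * var_t)
print(round(cov, 4), round(var_s, 4), round(corr, 4))
```

The mean product of the deviations comes out negative and equal to $-mp_sp_t$, as equation (3) requires.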

9. To find the standard error of the mean of a sample of m
observations.

Let us again assume a frequency distribution divided into
k groups, where $x_s$ is the value of the variable quantity x
associated with the sth group. For the reasons already
explained in the earlier sections of this chapter we must
distinguish between (1) the mean of the population represented by
the frequency distribution, namely

$$X = \Sigma(p_sx_s)$$

where $\Sigma$ indicates summation for all the k groups, and (2) the
mean calculated from a particular sample of m cases drawn at
random from this population, namely

$$\bar{x} = \frac{1}{m}\Sigma(n_sx_s).$$

The standard error of $\bar{x}$, say $\sigma_{\bar{x}}$, will provide a measure of the
extent to which the mean of the sample may differ from the
mean of the population. The value of $\sigma_{\bar{x}}$ may be found by
using the results (1) and (3) of the preceding sections.
Using a similar notation, we have

$$\delta\bar{x} = \bar{x} - X = \frac{1}{m}\Sigma(\delta n_s\,x_s).$$

As the expected value of $\delta n_s$ is zero, the expected value of $\delta\bar{x}$
is zero, or the mean value found from repeated sampling of the
mean of the sample is the same as the mean of the population.
Squaring both sides of the last equation above, we have

$$(\delta\bar{x})^2 = \frac{1}{m^2}\{\Sigma(\delta n_s^2\,x_s^2) + 2\Sigma'(\delta n_s\,\delta n_t\,x_sx_t)\}$$

where $\Sigma'$ indicates summation for all pairs of values of s and t
for which s is not equal to t.

If we now assume a large number, N, of samples to have been
taken and the corresponding values of $(\delta\bar{x})^2$ summed and the
result divided by N, we obtain

$$\frac{1}{N}S(\delta\bar{x})^2 = \frac{1}{m^2}\left\{\Sigma\left(x_s^2\,\frac{1}{N}S(\delta n_s^2)\right) + 2\Sigma'\left(x_sx_t\,\frac{1}{N}S(\delta n_s\,\delta n_t)\right)\right\}$$

where S denotes the summation in respect of the N samples.
The left-hand side of this equation is, in the limit, the squared
standard error of the mean of the sample, or $\sigma_{\bar{x}}^2$. On the
right-hand side $\frac{1}{N}S(\delta n_s^2)$ is the $\sigma_{n_s}^2$ of equation (1) and
$\frac{1}{N}S(\delta n_s\,\delta n_t)$ is given in equation (3). Hence

$$\begin{aligned}
\sigma_{\bar{x}}^2 &= \frac{1}{m^2}\{\Sigma(x_s^2\,mp_s(1-p_s)) - 2\Sigma'(x_sx_t\,mp_sp_t)\} \\
&= \frac{1}{m}(\mu_2' - X^2)
\end{aligned}$$

where $\mu_2'$ is the second moment, about the origin for x, of the
distribution of the population. But $\mu_2' - X^2 = \sigma_x^2$, therefore

$$\sigma_{\bar{x}} = \sigma_x/\sqrt{m} \qquad (5)$$

We thus find that the standard error of the mean is the ratio of
the standard deviation in the population to the square root
of the size of the sample.
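
As an illustration of this result we may return to the coin-tossing experiment of § 1. The sketch below assumes the population to be that of five unbiased coins, distributed as $(\tfrac{1}{2} + \tfrac{1}{2})^5$, and compares the observed mean of the 140 trials with its standard error; the variable names are ours.

```python
from math import comb, sqrt

# Population assumed: number of heads among five unbiased coins.
values = list(range(6))
props = [comb(5, x) / 32 for x in values]

X = sum(p * x for p, x in zip(props, values))                # population mean
mu2_origin = sum(p * x * x for p, x in zip(props, values))   # second moment about the origin
sigma_x = sqrt(mu2_origin - X**2)                            # population standard deviation

m = 140
se_mean = sigma_x / sqrt(m)              # standard error of the mean of m cases
observed_mean = 363 / 140                # from the table of § 1
deviation_in_se = (observed_mean - X) / se_mean
print(round(se_mean, 4), round(deviation_in_se, 2))
```

The observed mean differs from 2.5 by about one standard error, so the table of § 1 gives no ground for suspecting the coins.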

10. This last result is of considerable use in statistical work.
A large number of cases is recorded and the mean used to
compare the particular experiment with another of a like kind.
Is an actual difference between the means due to some cause
other than random sampling? A practical application would
be the comparison of the average profit from various classes of
business for a number of years. The standard error of the
average profit would be obtained by taking the square root of
the second moment about the mean of the profits in the various
years and dividing it by the square root of the number of years;
the quotient would give the $\sigma_{\bar{x}}$ of (5). It is only by using the
standard errors (or probable errors deduced from them) that we
could say definitely whether a lower average profit in a certain
part of the business was due to chance or to some causes
requiring removal.

11. In § 5 of this chapter it was mentioned that we can often
determine the actual curve to which the frequency distribution
of a statistical measure tends. We saw, in Chapter IV, that
$\beta_1$ and $\beta_2$ could, with the mean, be used to fix the frequency-curve
if it is of the Pearson family of curves, and it follows that if we
can find $\beta_1$ and $\beta_2$ for the frequency distribution of a statistical
measure we shall have gone a long way towards fixing the form
of the curve. If we write $\beta_1(x)$ and $\beta_2(x)$ as the moment ratios
for the population distribution of x and $\beta_1(\bar{x})$ and $\beta_2(\bar{x})$ for the
distribution of $\bar{x}$ (the sample mean) in repeated samples of
size m, then it can be shown (see R. Henderson, J. Inst. Actu.
xli, 429) that

$$\left.\begin{aligned}
\beta_1(\bar{x}) &= \beta_1(x)/m \\
\beta_2(\bar{x}) &= 3 + \{\beta_2(x) - 3\}/m
\end{aligned}\right\} \qquad (6)$$

Thus if the distribution of x is represented by the normal curve,
for which $\beta_1(x) = 0$ and $\beta_2(x) = 3$, it is seen that

$$\beta_1(\bar{x}) = 0 \quad\text{and}\quad \beta_2(\bar{x}) = 3$$

and the distribution of $\bar{x}$ is also normal. Even if the
distribution of x is not normal, it follows from equations (6) that $\beta_1(\bar{x})$
approximates to zero and $\beta_2(\bar{x})$ approximates to 3 if m is not
too small.
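
The rate at which the distribution of the mean approaches the normal form may be seen from a small sketch. It takes the relations to be $\beta_1(\bar{x}) = \beta_1(x)/m$ and $\beta_2(\bar{x}) = 3 + \{\beta_2(x) - 3\}/m$, and assumes, for illustration only, a markedly skew population of exponential form, for which $\beta_1 = 4$ and $\beta_2 = 9$, with samples of 30; neither the population nor the sample size is taken from the text.

```python
def moment_ratios_of_mean(beta1, beta2, m):
    """beta_1 and beta_2 for the mean of m cases, from those of the population."""
    return beta1 / m, 3 + (beta2 - 3) / m

# Illustrative skew population (exponential form): beta_1 = 4, beta_2 = 9.
b1, b2 = moment_ratios_of_mean(4.0, 9.0, m=30)
print(round(b1, 4), round(b2, 4))
```

Even from so skew a population the means of samples of 30 have $\beta_2$ within .2 of the normal value, though some skewness remains.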

12. The standard error of a standard deviation may be
taken as $\tfrac{1}{2}\sigma\sqrt{\{\beta_2(x) - 1\}/m}$ for large samples, and when the
distribution of the population approximates to the normal curve
of error (when $\beta_2(x) = 3$) the standard error becomes $\sigma/\sqrt{2m}$.

Another standard error which is often useful relates to the
difference between two percentages or proportions. Thus if we
make $m_1$ trials and the event happens $n_1$ times, and in an
independent $m_2$ trials we find $n_2$ happenings, in what circumstances
can we conclude that $p_1 = p_2$, where the sample estimate of
$p_1$ is $n_1/m_1$ and of $p_2$ is $n_2/m_2$? The solution might be useful
when two rates of mortality, withdrawal or sickness are being
compared.

If $p_1 = p_2 = p$, say, the standard error of the difference
$n_1/m_1 - n_2/m_2$ is $\sqrt{p(1-p)(1/m_1 + 1/m_2)}$. We do not really
know p, the underlying proportion to which the $p_1$ and $p_2$ of
our experiments approximate, but on the hypothesis that there
is a common value we may make an estimate of it from

$$(n_1 + n_2)/(m_1 + m_2).$$

This leads to a standard error of

$$\sqrt{\frac{n_1 + n_2}{m_1 + m_2}\left(1 - \frac{n_1 + n_2}{m_1 + m_2}\right)\left(\frac{1}{m_1} + \frac{1}{m_2}\right)}.$$


As an example we may take (1) 1000 cases with 22 withdrawals,
giving a rate of withdrawal of .0220, and (2) 600 cases with
19 withdrawals, giving a rate of .0317. Is the difference .0097
significant? The combination of the two experiences gives
41/1600 or .0256 as the rate of withdrawal. The standard error
by the formula last given is .0082. The difference, being little
more than the standard error, is not significant. If however the
numbers had all been twenty times greater, the standard error
would have been .0018 and, the difference being more than five
times as great, we should be satisfied that it is significant.
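
The standard error of the difference between the two rates of withdrawal may be evaluated by the following sketch, which applies the pooled formula of this section to the figures of the example; the function name is ours.

```python
from math import sqrt

def se_difference(n1, m1, n2, m2):
    """Standard error of n1/m1 - n2/m2 on the hypothesis of a common proportion."""
    p = (n1 + n2) / (m1 + m2)            # pooled estimate of the proportion
    return sqrt(p * (1 - p) * (1 / m1 + 1 / m2))

n1, m1, n2, m2 = 22, 1000, 19, 600
difference = n2 / m2 - n1 / m1
se = se_difference(n1, m1, n2, m2)
se_twenty_fold = se_difference(20 * n1, 20 * m1, 20 * n2, 20 * m2)
print(round(difference, 4), round(se, 4), round(se_twenty_fold, 4))
```

Direct evaluation gives a standard error of about .0082 for the experience as it stands and about .0018 if all the numbers are multiplied by twenty, so that the same difference is inconclusive in the first case and decisive in the second.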

13. In similar ways it is possible to find the standard errors
of the moments and constants, but this leads to the more
theoretical parts of the subject with which it is inadvisable to
deal in a book of this character. It is, however, necessary to
call attention to the standard error of the coefficient of
correlation owing to the importance of that function in
statistical work.

As in the case of the mean, it will help to avoid confusion if
we use a symbol, $\rho$, for the correlation coefficient in the
population itself different from the symbol, r, for the coefficient
calculated from a particular sample of m pairs of observations.
From one sample to another r will vary about $\rho$, and it has been
shown that the standard error of r is, for large samples,*

$$\sigma_r = (1 - \rho^2)/\sqrt{m - 1} \quad\text{approximately} \qquad (7)$$

If we do not know $\rho$ we use r as an approximation to it.

* When m is large $\sqrt{m}$ can be used for $\sqrt{m - 1}$ here and in similar formulae.

14. This result was first given with $\sqrt{m}$ in the denominator
by K. Pearson and L. N. G. Filon as an approximation when
m is large (Philos. Trans. A, cxci, 231-41). Later R. A. Fisher
(see Biometrika, x, 507-21) obtained the exact distribution for
r when samples are drawn from a population following the
normal correlation surface of p. 159 above. The closeness of
the approximation by formula (7), as well as the form of the
sampling distribution of r in such circumstances, can be studied
from tables given in Tables for Statisticians, Part II, Table xxxii,
or Biometrika, xi, 328 et seq. It can be seen from these tables
that if $\rho = 0$, formula (7) gives a good value for $\sigma_r$ even for very
small values of m, but as $\rho$ becomes larger the approximation is
less satisfactory, partly because the formula does not give a
close value and partly because, even if $\sigma_r$ be found closely, the
distribution of r is such that + and - deviations are not
equally likely and the usual rule that twice the standard error
covers nearly the whole field may not apply. It is difficult to
give a more definite statement, but it may be of help to say
that if m > 400 formula (7) can be used.* If however m = 100,
care is needed in interpreting $\sigma_r$ unless $\rho$ is less than .5, and if
m = 50, unless $\rho$ is less than .3.

R. A. Fisher suggested (Metron, i, 1921) a useful
transformation to

$$\tfrac{1}{2}\{\log_e(1 + r) - \log_e(1 - r)\}$$

which is distributed very nearly normally with a standard error of

$$1/\sqrt{m - 3}$$

whatever the value of $\rho$.
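
Formula (7) and Fisher's transformation are both easily evaluated. The sketch below takes the case m = 400, $\rho$ = .9 mentioned in the footnote to this chapter; the function names are ours, and the standard library's inverse hyperbolic tangent is used only to confirm that the transformation has been written correctly.

```python
from math import atanh, log, sqrt

def se_r(rho, m):
    """Large-sample standard error of r by formula (7)."""
    return (1 - rho**2) / sqrt(m - 1)

def fisher_z(r):
    """Fisher's transformation of a correlation coefficient."""
    return 0.5 * (log(1 + r) - log(1 - r))

def se_z(m):
    """Standard error of the transformed value for m pairs."""
    return 1 / sqrt(m - 3)

approx = se_r(0.9, 400)      # the case m = 400, rho = .9
z = fisher_z(0.9)
print(round(approx, 5), round(z, 4), round(se_z(400), 4))
```

The standard error of the transformed value depends on m alone, which is what makes the transformation convenient when $\rho$ is large.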

15. As an application of formula (7) we may take the
example in Chapter VII, where we found that the coefficient
of correlation between the age at maturity and the unexpired
term of endowment assurances is .254. It is not right however
to assert that this coefficient exactly represents the correlation:
the real measure may be greater or less, and considerations
arise similar to those exemplified in § 6. But there is another
point in connection with a coefficient of correlation: we cannot
even say that there is any real relationship till we have
examined the standard error. In our example m = 2870 and
r = .254, so that $\sigma_r$ = ± .016. In this case, therefore, the
standard error is so small that the result is reliable, but if we
had found r = .073 with a standard error of .05 it would have
been impossible to say definitely that the correlation had not
arisen merely from chance.

16. This brings us to an important application of the
standard error in formula (7) which can be made safely even
when m is as small as 30. If there is really no correlation, then
$\rho = 0$ and the expression in (7) reduces to

$$\sigma_r = 1/\sqrt{m - 1} \qquad (8)$$

Thus, to go back to the example in Chapter VII, if we assume
that there is no correlation, $r - \rho$ = .254 with a standard error
of $1/\sqrt{2869}$ or .0187. The difference, $r - \rho$, is well over twelve
times the standard error; it is therefore almost impossible that
the correlation was zero in the population from which the
sample of 2870 may be supposed to have been drawn.

* For m = 400, $\rho$ = .9, we find $\sigma_r$ = .00957; by formula (7) $\sigma_r$ is .00951; the
distribution of r is described by mean r = .8998, mode = .9011, $\beta_1$ = .07402,
$\beta_2$ = 3.1342. This can only be roughly represented by a normal curve.
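
The test of this section amounts to a single division, sketched here with the figures of the example from Chapter VII; the variable names are ours.

```python
from math import sqrt

m, r = 2870, 0.254
se_null = 1 / sqrt(m - 1)     # formula (8): standard error of r when rho = 0
ratio = r / se_null           # the observed r in units of that standard error
print(round(se_null, 4), round(ratio, 1))
```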

17. Formula (7) above is appropriate only for a coefficient
of correlation calculated by the method described in Chapter
VII. In using the fourfold table the standard errors are larger,
as would be expected, because the grouping is rougher, and the
formula by which they should strictly be calculated becomes
complicated. The formula referred to gives as the standard
error of r,

$$\frac{1}{\chi\sqrt{N}}\left\{\frac{(a+d)(c+b)}{4N^2} + \psi_2^2\,\frac{(a+c)(d+b)}{N^2} + \psi_1^2\,\frac{(a+b)(d+c)}{N^2} + 2\psi_1\psi_2\,\frac{ad-bc}{N^2} - \psi_2\,\frac{ab-cd}{N^2} - \psi_1\,\frac{ac-bd}{N^2}\right\}^{\frac{1}{2}}$$

where

$$\chi = \frac{1}{2\pi\sqrt{1-r^2}}\,e^{-(h^2+k^2-2rhk)/2(1-r^2)},$$

$$\psi_1 = \frac{1}{\sqrt{2\pi}}\int_0^{\beta_1} e^{-\frac{1}{2}x^2}\,dx, \qquad \beta_1 = \frac{h - rk}{\sqrt{1-r^2}},$$

$$\psi_2 = \frac{1}{\sqrt{2\pi}}\int_0^{\beta_2} e^{-\frac{1}{2}x^2}\,dx, \qquad \beta_2 = \frac{k - rh}{\sqrt{1-r^2}},$$

and it is assumed that the fourfold table is so arranged that
a + c > b + d and a + b > c + d, where a, b, c and d have the
meanings indicated on p. 173. The numerical work for finding
the standard error of r for the example in Chapter IX is as
follows:

$$\psi_1 = \frac{1}{\sqrt{2\pi}}\int_0^{.56821} e^{-\frac{1}{2}x^2}\,dx = .21505$$

by Tables for Statisticians, Part I, Table ii,

$$\psi_2 = \frac{1}{\sqrt{2\pi}}\int_0^{.322} e^{-\frac{1}{2}x^2}\,dx = .12639,$$

$$\chi = \frac{1}{2\pi\sqrt{1-r^2}}\,e^{-(h^2+k^2-2hkr)/2(1-r^2)} = \frac{e^{-.86744}}{2\pi \times .63900} = .10462,$$

$$\log\frac{1}{\chi} = .98039, \quad \log\psi_1 = \bar{1}.33254 \quad\text{and}\quad \log\psi_2 = \bar{1}.10171,$$

$\therefore$ the standard error of r is

$$\frac{1}{.10462\sqrt{4703}}\{.02283 + .00?5 + .004?9 + .00252 - .00408 - .01015\}^{\frac{1}{2}} = \pm .018$$
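
Those constants of the working that depend only on quantities printed above can be checked by the following sketch. It takes as given the upper limit .56821, the exponent .86744, the value .63900 for $\sqrt{1-r^2}$, N = 4703 and the final standard error .018; the cell frequencies a, b, c, d are not reproduced here, so the bracketed terms themselves are not recomputed. The normal integral is taken from the Python standard library in place of the printed tables, and the variable names are ours.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

std_normal = NormalDist()

psi_1 = std_normal.cdf(0.56821) - 0.5        # integral from 0 to .56821
chi = exp(-0.86744) / (2 * pi * 0.63900)     # .63900 is the square root of 1 - r^2
multiplier = 1 / (chi * sqrt(4703))          # factor outside the bracket

# Comparison with formula (7) for the same r and number of cases.
one_minus_r2 = 0.63900 ** 2
se_formula_7 = one_minus_r2 / sqrt(4703 - 1)
ratio = 0.018 / se_formula_7                 # fourfold standard error against formula (7)
print(round(psi_1, 5), round(chi, 5), round(multiplier, 4), round(ratio, 1))
```

On these figures the standard error from the fourfold table is about three times that which formula (7) would give for the same number of cases.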

18. The standard errors found by this method are larger than
would result from formula (7) and in many cases are as much
as three times as great; this actually happens in our example.
The correct formula is rather troublesome, but Tables for
Statisticians, Part I, Tables xxiii and xxiv, based on an
approximation, minimise the arithmetical work. The
approximation can be safely used except when the divisions of the
correlation table differ extremely.

19. It will be noticed that, as we anticipated, all the
expressions for the standard errors contain the square root of the
number of cases in the denominator. We anticipated in the
first paragraphs of this chapter that the standard error would
decrease as the number of cases increased, and we can now say
that in each of the cases discussed the standard error varies
inversely with the square root of the number of cases. The
student should make it a rule to work out standard errors, and
he will find that much labour can be saved by using tables,
usually of “probable errors”, that have been published in the
Tables for Statisticians.

The object in calculating standard errors is to prevent
ourselves from reading too much from the means or other
measures we have calculated, but we must not run to the
opposite extreme and rely more on a standard error than the
theory justifies. Thus, at certain points, our theory has assumed
that the characteristics are distributed in a form approximating
to a normal curve of error, and a good deal of evidence has
been produced showing that this is a reasonable assumption
for many characteristics when the number of cases exceeds 30,
or for some characteristics with even smaller numbers. The
assumptions imply that plus and minus errors are equally
probable, but it would not be right to assert that the means of
a sample of a J-shaped distribution are equally likely to fall
above and below the true mean within twice the standard
error, and formulae (6) above help to indicate this limitation.

20. We may now refer briefly to some practical points in
“sampling”. The essence of sampling is that we form an
opinion of the whole by examining a sample of it, and error
may arise (1) owing to bias in making up the sample or (2)
owing to the particular sample giving a wide deviation from
the whole because it is based on a small number of cases.

It is, usually, not difficult to guard against bias in actuarial
or sociological practice. For instance, if we require to estimate
the mortality of lives assured we might collect information
merely in respect of persons whose names begin with A. This
would give fewer cases, but there is no reason to suspect that
such lives differ from those whose names begin with the other
letters of the alphabet. The selection of a particular letter might,
however, lead to suspicion if it could introduce a question of
race in a mixed community, e.g. in Alsace-Lorraine, if we
worked with people whose names begin with W we should
exclude those of French extraction but include those of
German extraction. An alternative is to take one case in, say,
each hundred, e.g. the mortality of lives assured could be
investigated by examining from the registers of the insurance
offices every hundredth case.

Sampling of this kind is useful in social investigations where
we may, perhaps, want to examine the home conditions of
school children and cannot hope to get from every home
particulars of the health, occupation or habits of the residents.
We might, however, be able to make an exhaustive examination
of 2,000 or 3,000 cases. With a free hand it is not difficult
to obtain a random sample, and a little thought and common
sense is all that is required.

The other risk of error lies in the fact that we have only a
small sample, and it is here that the subject is connected with
that of “standard errors”. If we may assume that the sample
is chosen at random and, though not of itself small, is small
compared with the population from which it is drawn, then we can
follow the methods indicated in the earlier part of this chapter.

21. Special circumstances, however, arise in some experiments,
and one type of case may be specially mentioned. It is
frequently necessary to test the comparative yields of different
varieties of the same plant. The trouble in such a case is that
plots placed far apart even in a small field produce widely
different results, but small adjacent plots resemble each other.
In order to make a fair comparison we ought, therefore, to have
a number of pairs of adjacent plots. The comparison is made
between a number of pairs and we are concerned with the
differences between these pairs and must work out

$$\sigma^2 = \frac{S(y - x)^2}{m}$$

where m is the number of “pairs”, and x and y are the
corresponding members of a pair measured from their means.
It is important to distinguish this sort of case, where the
pair formed from adjacent plots is the unit, from the different
case where we draw a sample of $m_1$ observations from one
record with a standard deviation of $\sigma_1$ and a second sample of
$m_2$ from another record in which the standard deviation is $\sigma_2$.
In this case the standard deviation of the difference between the
two means is given by

$$\sigma^2 = \frac{\sigma_1^2}{m_1} + \frac{\sigma_2^2}{m_2}$$


This assumes that there is no correlation between the variables,
but in the “pairs” problem we have arranged “pairs” because
we expect correlation. Algebraically the correlation is
indicated by the xy term of $S(y - x)^2 = S(y^2 - 2xy + x^2)$.

It will be appreciated from what has been written elsewhere
in this chapter that it is assumed that the samples are
sufficiently large to justify the assumption that the $\sigma$'s calculated
from the samples can be treated as the standard deviations of
the population.

The use of the wrong formula may lead to erroneous
conclusions: the actual difference between the means may be
30, the standard error by $\sigma^2 = S(y - x)^2/m$ may be 6, and by
$\sigma^2 = \sigma_1^2/m_1 + \sigma_2^2/m_2$ may be [...]. Judged by the former the
difference is almost certainly significant; judged by the latter
it is doubtful.
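The contrast between the two formulae can be sketched numerically. The plot yields below are hypothetical figures chosen only for illustration, and the division by m to pass from the variance of the differences to the variance of their mean is an assumption about how the first formula is meant to be applied.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Mean squared deviation from the mean (divisor m, as in the text)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

def paired_variance(x, y):
    """S(y - x)^2 / m, with x and y each measured from their own means."""
    mx, my = mean(x), mean(y)
    return sum(((b - my) - (a - mx)) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical yields from six pairs of adjacent plots (not from the book).
x = [40.0, 52.0, 61.0, 47.0, 70.0, 55.0]
y = [43.0, 56.0, 63.0, 52.0, 74.0, 57.0]
m = len(x)

# Product-moment term: the "xy term" of S(y^2 - 2xy + x^2).
cov = sum((a - mean(x)) * (b - mean(y)) for a, b in zip(x, y)) / m
paired = paired_variance(x, y)
unpaired_sum = variance(x) + variance(y)

# Assumed: standard error of the mean difference under each treatment.
se_paired = sqrt(paired / m)
se_unpaired = sqrt(variance(x) / m + variance(y) / m)
print(round(se_paired, 3), round(se_unpaired, 3))
```

Because adjacent plots rise and fall together, the xy term is large and the paired standard error comes out far smaller than the one that ignores the pairing.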

The kind of problem indicated might arise whenever it is
necessary to compare the results of alternative methods in
changing conditions, and the theory which was worked out
primarily to test yields may prove valuable elsewhere.
CHAPTER XI


THE TEST OF GOODNESS OF FIT

1. When the values of ordinates and areas were calculated
in the examples of the various types of frequency-curves, no
systematic attempt was made to test the graduations in order
to ascertain whether the results obtained were reasonable.
Actuaries have generally been in the habit of imposing on the
graduated values of any table on which they may have been
working, rough checks which have amounted to a comparison
of the totals in various groups and an inspection of the changes
of sign in the differences between the graduated and ungraduated
figures. The problem of the goodness of fit needs,
however, more accurate treatment, for inspection, even when
aided by the calculation of a standard error for each group, can
only tell that certain differences are large, and if the standard
error be exceeded in two or three cases, it is impossible to say
whether the excesses are in any way balanced by equalities in
the rest of the graduation. A test is required which will give
some measure of the disagreement as judged by the whole
graduation.

2. Now, if there be N observations distributed in n + 1
groups, the numbers in the groups being $m'_1, m'_2, \ldots, m'_{n+1}$, we
have to find a criterion to enable us to decide when the series
$m_1, m_2, \ldots, m_{n+1}$ will be a legitimate graduation. We may
clearly take a legitimate graduation to be one in which the
observed values ($m'$) do not differ from the theoretical ($m$) by
more than the deviations that would be expected in random
sampling. What we require to know is not the probability that
the particular series of $m'$s will occur if the $m$'s represent the
theory, but the probability that the $m'$s, or an equally likely
or less likely series, will arise. To appreciate the difficulties of
the problem we may consider the simplest case, that of a
coin-tossing experiment, and suppose that a coin has been tossed
six times and come down 4 heads and 2 tails. The “graduation”
we make is 3 heads and 3 tails, and to test it we require to find
the probability of obtaining a result as unlikely as, or more
unlikely than, the observed one. This probability is the same as
that of getting any one of the following results:

    6 heads and 0 tails
    5 heads and 1 tail
    4 heads and 2 tails
    2 heads and 4 tails
    1 head  and 5 tails
    0 heads and 6 tails
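For this easiest case the probability can be found by direct enumeration. The following is a minimal sketch, assuming a fair coin; the function name is illustrative only.

```python
from fractions import Fraction
from math import comb

def prob_as_unlikely(tosses, observed_heads):
    """Chance of a result no more probable than the observed one (fair coin)."""
    observed_ways = comb(tosses, observed_heads)
    ways = sum(
        comb(tosses, heads)
        for heads in range(tosses + 1)
        if comb(tosses, heads) <= observed_ways
    )
    return Fraction(ways, 2 ** tosses)

p_six_tosses = prob_as_unlikely(6, 4)
print(p_six_tosses, float(p_six_tosses))  # 11/16 0.6875
```

The six listed results account for 1 + 6 + 15 + 15 + 6 + 1 = 44 of the 64 equally likely sequences, so only the result 3 heads and 3 tails is excluded.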

It is impossible to calculate such probabilities directly, even
when the simple probabilities leading to the deviations are
known, in any but the easiest cases, but when we do not know
the simple probabilities, or the case is a complicated one, a
further difficulty is introduced owing to our inability to tell
from a priori reasoning which of the possible cases are more or
less likely than that which has actually arisen. It would, for
instance, be difficult to say, without a large amount of
arithmetical work, when 20 dice were being thrown, whether the
probability of getting ten “sixes” or more was greater than
that of getting two “sixes” or fewer, but this is an extremely
simple case compared with the general proposition in which
deviations over a series of numbers have to be considered.

3. If it is assumed in any measurement on one subject that
the deviations from the mean take the form of the “normal
curve of error”, and it is required to estimate the chance of
obtaining deviations greater than a certain value (t, say), it will
be necessary to sum all values of the normal curve beyond t on
each side of the mean, i.e. we must take

$$\int_{-\infty}^{-t} e^{-\frac{1}{2}x^2}\,dx + \int_{t}^{\infty} e^{-\frac{1}{2}x^2}\,dx = 2\int_{t}^{\infty} e^{-\frac{1}{2}x^2}\,dx$$

and divide the result by the area of the whole curve, i.e. by
the total deviations. Assuming that there are two measurements
instead of one (the exposed to risk, for instance, at two
ages), the deviations are, as it were, in two directions instead
of one, and it is necessary to take an expression with two
variables instead of one. The expression analogous to the
normal curve is the correlation surface

$$z = z_0\, e^{-\frac{1}{2(1 - r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x \sigma_y} + \frac{y^2}{\sigma_y^2}\right)}$$


with which we have already dealt. The integrations must be
performed for both variables from t and t' onwards, and
compared with the total. If there are n measurements it becomes
necessary to deal with a function of n variables, and this will
give the reader a slight idea of the problem from the
mathematical point of view, and suggest that he will expect the
quotient of two n-fold integrals to give the probability. The
next step is to reduce these n-fold integrals to the form of
ordinary integrals, and it has been shown* that the result

$$P = \frac{\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi}{\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n-1}\,d\chi}$$

is reached.† In this expression $\chi$ stands for a complex function
depending on the n variables from which the expression was
evolved, and measures the position that is indicated by the
probability of the particular distribution, the test for the
graduation of which is required.


* Originally by K. Pearson, “On the criterion that a given system of deviations
from the probable in the case of a correlated system of variables is such that
it can be reasonably supposed to have arisen from Random Sampling,”
Phil. Mag., July 1900. A short proof has been given by H. E. Soper in
“Frequency Arrays”.

† A table of P for all values of n' = n + 1 from 3 to 30, corresponding to $\chi^2$
from 1 to 30, with a few additional values and auxiliary tables for the calculation
of further values, is given in Tables for Statisticians, Part I. An abridged table
is given in Appendix IX.

4. Before a measure of the probability P can be obtained a
value for $\chi$ must be found from the statistics of the particular
graduation, and in the paper to which reference has already
been made its value is shown to be such that

$$\chi^2 = S\left\{\frac{(m' - m)^2}{m}\right\}$$

It is natural, almost necessary, to use the square of the
difference in order that negative differences may, equally with
positive differences, increase the improbability of the system,
while a ratio is required to bring into account the size of the
group, for an error of 15 in a group of 20 would be very large,
but in a group of 1,000 would be negligible.

5. The practical aspects of the test of goodness of fit and its
application may now be dealt with.

(1) If the facts representing the graduated and ungraduated
figures are only available in groups, then the value of the
probability by the test will, as a rule, be lower as the number of
groups is increased. This practical point should be borne in
mind, as it sometimes happens that graduations are tested in
groups of, say, 5 years of age, but the graduated figures for
individual ages are then used unreservedly, though, strictly
speaking, they may be no better than interpolated values.

(2) The test assumes a distribution, and would not be
applicable if the numbers were a series of ordinates, though the
application of the test would probably give a fair idea of the
goodness of fit if a large number of ordinates had been given in
the series.

(3) The tails of the experience will be very small and never
fit exactly. We ought to take our final theoretical groups to
cover as much of the tail area as amounts to at least a unit of
frequency in such cases.

(4) If the number of observations be multiplied by t, say,
and the deviations are also multiplied by t, then the value of
$\chi^2$ will be multiplied by the same figure, and the test will show
that the fit is worse. This may seem strange at first, but a
little consideration will show that it is reasonable. As a large
number of cases will give smoother series than a small number,
it follows that if two results are proportionally the same in
two examples having the same theoretical distribution but
different total frequencies, the one with greater frequency is
less probable than the one with less frequency. The probability
of a result as bad as, or worse than, three heads and one tail in
coin-tossing (two heads and two tails being the theoretical
result) is .625, but the probability of a result as bad as, or worse
than, 3 × 2 = 6 heads and 1 × 2 = 2 tails is .289. It follows
that if a distribution is based on, say, 103,480 cases and the
figures are reduced to a total of 1,000 to show the distribution
of the cases, then a graduation tested as if 1,000 were the total
frequency will give the impression that the graduation is far
closer than it really is.

(5) I have found, in applying the test, that when the numbers
dealt with are very large the probability is often small,
even though the curve appears to fit the statistics very closely.
The explanation may be that the statistics with which we deal
in practice nearly always contain a certain amount of extraneous
matter, and the heterogeneity is concealed in a small
experience by the roughness of the data. The increase in the
number of cases observed removes the roughness, but the
heterogeneity remains. The meaning, from the curve-fitting
point of view, is that the experience is really made up of more
than one frequency-curve, but a certain curve, approximating
to the one calculated, predominates. Another possible
explanation is that our solution of the problem depends on the
assumption of a mathematical expression which does not give
exactly the distribution of deviations, and when we deal with
a large experience the approximate nature of the assumptions
is revealed.

(6) What is the actual value of P at which a good fit ends
and a bad one begins? It is impossible to fix such a value. We
have merely a measure of probability for the whole table, and
if the odds against the graduation are twenty or thirty to one
the result is unsatisfactory; if they are ten to one the graduation
is not unreasonable, but the exact value when a result
must be discarded cannot be given. As, however, it is clearly
impossible to imagine any test which can fix an absolutely
definite standard, there is no reason for objecting to the
particular method because it fails to do so.

(7) It is sometimes thought that the introduction of
additional constants must necessarily improve the fit of a
curve. It may do so in some cases, but it is quite possible to
take a curve with ten constants and find it gives a worse result
than another having only three. Besides this, there is the
possibility of undergraduation; we must not expect to reach
a very high value for P, e.g. .95. If we make an experiment in
coin-tossing, it is unlikely that a single experiment will give
a distribution very close to the theoretical. If therefore we are
estimating the probability of getting that result or worse, we
shall only rarely get a very high or a very small value for that
probability. We shall do so occasionally, but we must not
expect it, and it is wise to look for explanations when any
graduation gives a very high or very low value of P.

(8) It may sometimes be advisable to use a curve giving a 
worse agreement than another for simplicity, or for reasons 
such as those which prompt actuaries to employ Makeham’s 
hypothesis. 
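The coin-tossing figures quoted in point (4) can be checked by enumeration. This is a small sketch assuming a fair coin, with “as bad as, or worse” taken to mean a split at least as far from equality as the one observed; the function name is illustrative.

```python
from math import comb

def prob_as_bad_or_worse(tosses, heads):
    """Chance of a split at least as far from equality as the observed one."""
    observed_gap = abs(2 * heads - tosses)
    ways = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(2 * k - tosses) >= observed_gap
    )
    return ways / 2 ** tosses

p_small = prob_as_bad_or_worse(4, 3)   # three heads and one tail
p_large = prob_as_bad_or_worse(8, 6)   # six heads and two tails
print(round(p_small, 3), round(p_large, 3))  # 0.625 0.289
```

Doubling both the number of tosses and the deviations leaves the proportions unchanged but makes the result markedly less probable, which is the point made about reducing a large experience to a total of 1,000.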

6. In a paper “On the Comparative Reserves of Life
Assurance Companies, etc.” (J. Inst. Actu. XXXVII, 458-9),
George King remarked that it is permissible to use the $H^M$
Model Office for the $O^M$, and it will be interesting to apply the
formula given above to see what is the probability of the $O^M$
distribution if the $H^M$ be taken as the theoretical distribution.

In the table below there are ten groups, and $\chi^2$ = 1.79,
and Tables for Statisticians give P = .999438 and .991468 when
$\chi^2$ = 1 and 2 respectively. It is not, however, sufficient to test
for 100 new policies; 950 would reduce the probability to
about .05, which means that in only one case out of twenty
would a random sampling lead to a system of deviations from
the $H^M$ as great as that shown by the $O^M$. This result will
remind the student of the great danger of dealing with
percentages without considering the actual number of cases
investigated. King's other table, which is of greater importance
in his work (policies according to attained age), shows a much
closer agreement, as P = .831051 for 10,000 cases.


Central     Policies issued arranged
age in           in age-groups           O^M - H^M       (Square of
group                                                    difference)
              H^M         O^M          +          -         / H^M

  20          6.97        7.30         .33                    .02
  25         17.75       20.45        2.70                    .41
  30         21.04       23.11        2.07                    .20
  35         18.41       18.40                    .01         .00
  40         13.82       13.05                    .77         .04
  45          9.45        8.44                   1.01         .11
  50          6.23        5.07                   1.16         .22
  55          3.51        2.58                    .93         .25
  60          1.97        1.20                    .77         .30
  65           .85         .40                    .45         .24

Total       100.00      100.00        5.10       5.10    χ² = 1.79
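The figures of this example can be reproduced with a short calculation. The sketch below assumes that the tabulated P corresponds to nine degrees of freedom (n' = 10 groups) and uses the standard closed-form series for the χ² tail; `chi2_tail` is an illustrative helper, not a function from any published table or library.

```python
from math import erfc, exp, pi, sqrt

def chi2_tail(chi2, df):
    """P(chi-square with df degrees of freedom >= chi2), by the finite series."""
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j < df/2.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return exp(-half) * total
    # Odd df: normal tail plus sum of chi^(2r-1) / (1*3*...*(2r-1)).
    chi = sqrt(chi2)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return erfc(chi / sqrt(2.0)) + sqrt(2.0 / pi) * exp(-half) * total

# Percentages of policies in each age-group, from the table above.
hm = [6.97, 17.75, 21.04, 18.41, 13.82, 9.45, 6.23, 3.51, 1.97, 0.85]
om = [7.30, 20.45, 23.11, 18.40, 13.05, 8.44, 5.07, 2.58, 1.20, 0.40]

chi2_100 = sum((o - h) ** 2 / h for h, o in zip(hm, om))
df = len(hm) - 1                       # assumed: n' = 10, one deducted
p_100 = chi2_tail(chi2_100, df)
p_950 = chi2_tail(chi2_100 * 9.5, df)  # same percentages, 950 policies
print(round(chi2_100, 2), round(p_100, 3), round(p_950, 3))
```

The unrounded sum is slightly below the 1.79 obtained from the rounded column, and scaling it for 950 policies brings the probability down to roughly one in twenty, as stated.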


7. We will now revert to §2 of this chapter where in stating
the problem it was said that the N observations were distributed
in n + 1 groups. As we have only N observations to
distribute we can only choose n groups, for having fixed those
n groups the last one is necessarily fixed; freedom of choice is
restricted to this extent, and in any problem where the method
is used the number of groups where freedom of choice is possible
must be borne in mind. This is implied in the proofs leading up
to the formulae which have been given. Now, following on this
argument, the reader may ask whether it is fair in comparing
a Type I and a Type III graduation of certain material to use
the same value of n when there are four constants necessary
to reach the former and three to reach the latter. He may ask
“are we not really restricting our freedom of choice more in the
former case than the latter because, to take an extreme case,
we should reproduce a distribution of only four groups exactly
with Type I and alter it, that is have freedom of choice, if we
use Type III?”

8. Before we deal with this question we may explain that in
the previous paragraphs of this chapter two distinct problems
have been covered by the one word “graduation”. These
problems are:

I. Given a theoretical distribution, to ascertain the probability
of getting an actual distribution or an equally likely or a
less likely one.

Here is an example which compares the theoretical number
of “heads”, when six coins are tossed, with an actual distribution.


No. of                                             Square of
“heads”    Theoretical    Actual    (2) - (3)        (4)        (5)/(2)
  (1)          (2)          (3)        (4)           (5)          (6)

   0            1            0          1             1          1.00
   1            6            6          0             0           .00
   2           15           12          3             9           .60
   3           20           23         -3             9           .45
   4           15           18         -3             9           .60
   5            6            3          3             9          1.50
   6            1            2         -1             1          1.00

 Total         64           64                               χ² = 5.15

If n' = 7 and $\chi^2$ = 5.15, then P = .52.
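The arithmetic of this table is easily verified. The sketch below takes n' = 7 to mean six degrees of freedom, for which the χ² tail reduces to a short finite sum; the variable names are illustrative.

```python
from math import comb, exp

theoretical = [comb(6, heads) for heads in range(7)]  # 1, 6, 15, 20, 15, 6, 1
actual = [0, 6, 12, 23, 18, 3, 2]

# Column (6) of the table: squared difference divided by the theoretical number.
terms = [(t - a) ** 2 / t for t, a in zip(theoretical, actual)]
chi2 = sum(terms)

# For six degrees of freedom, P = exp(-x/2) * (1 + x/2 + (x/2)^2 / 2).
half = chi2 / 2.0
p = exp(-half) * (1.0 + half + half ** 2 / 2.0)
print(round(chi2, 2), round(p, 2))  # 5.15 0.52
```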

II. Given a graduation of an actual distribution, to ascertain
the probability that the deviations will be the same as or
greater than those found.

The answer depends on the number of constants in the
formula used for graduation. If there are r constants we should
deduct r from the number of groups instead of deducting unity
as is, in effect, done in the last example for n' = n - 1. The mean
is used to fix the position of the curve and must be counted
as a constant. Consequently we must deduct 3 if the normal
curve is used (i.e. one for the total number of cases, one for the
mean and one for the s.d.), 4 for Type III and 5 for the main
types.* Generally speaking, the same result is obtained if the
number of moments used in the calculations be deducted. This
gives the theoretical answer to the question raised at the end
of §7 above. It is not always easy to interpret the number
of moments in applying the rule: thus we choose between
Type III and Type V by using the fourth moment, though
there are only three moments needed to find the constants.
Again, if in Type I the start of the curve is fixed, three moments
only are used, while if the range is fixed, only two moments
are used (and in effect the number of constants is similarly
decreased). If we make a rough attempt at a graduation by a
Type I curve using four unadjusted moments and then vary
the start of the curve as indicated in Chapter V, §10, then the
final graduation only uses three moments. It can be argued
that the full number of constants has been assumed and four
moments have really been used.

* In Tables for Statisticians, Part I, n' = n - 1, and one is, therefore, already
deducted. It follows that n' - 2 would be the number to be used if the normal
curve has been used for graduating.

9. The example given for Problem I in §8 can be used to
explain the point mentioned in §5 about undergraduation. We
may, on a particular occasion, reach an actual distribution
identical with the theoretical; $\chi^2$ will then be zero and P will
be unity. Similarly we may reach a distribution so far from the
theoretical as to seem well-nigh impossible. One of these
exceptional cases may appear, and if we repeat the experiment
long enough we shall get distributions giving all values of P.
Similarly with graduation, we are unlikely, if we know the
right form of curve, to find a value of P that is infinitesimally
small or very near to unity, but neither is impossible.

10. When we merely want to compare several graduations
of the same distribution we can often stop our work after the
calculation of $\chi^2$. Thus if we make graduations by Type I using
various adjustments or compare them with Type A or Type B
using the same number of constants, the lowest value of $\chi^2$
shows the closest graduation. Even if the number of constants
differs, the value of $\chi^2$ shows which graduation is actually
closest, and for some actuarial work this may be more important
than the study of the probabilities.

Bearing in mind that there are difficulties in interpreting
the number of degrees of freedom in some cases, we may
consider what is implied when we use the solution of Problem I
for Problem II. All the old applications of the (P, $\chi^2$) test were
made in this way. In such circumstances we are saying, in
effect, that the graduation is a theoretical distribution not
necessarily obtained from the actual distribution but by
general reasoning or from other previous experience, and that
we are measuring the probability of divergences from that
theoretical distribution as great as or greater than those of
the actual distribution.

The points set out in these paragraphs are mentioned
because it is well to be reminded that we must not read into a
good general test of graduation a refinement which is neither
justified by the underlying theory nor required in practical
work.

11. Reference may here be made to a test of a graduation of
a mortality table. The data are expressed as “exposed to risk”
($E_x$) at each age (or group of ages) and “deaths” ($\theta_x$). A
graduation of the rates of mortality is made and the “expected
deaths” ($\theta'_x$) are calculated by multiplying the values of $E_x$ by
the appropriate graduated rates of mortality ($q_x$). We have,
therefore, graduated the series

    $\theta_x,\; E_x - \theta_x,\; \theta_{x+1},\; E_{x+1} - \theta_{x+1},$ etc.

by  $\theta'_x,\; E_x - \theta'_x,\; \theta'_{x+1},\; E_{x+1} - \theta'_{x+1},$ etc.

The $E_x$ is fixed in each pair, so, if there are 40 ages, there are
only 40 degrees of freedom, not 80. But the $\chi^2$ should be
calculated from all the 80 values, although when $E_x$ is large
relatively to $\theta$, as it is at nearly every age, the $E - \theta$ terms give
zero elements. It will be easier for the reader to follow this
argument if he bears in mind that the total of the $\theta$'s need not
be reproduced exactly by the $\theta'$s. Deduction will have to
be made from the 40 degrees of freedom for the number of
constants used in the graduation.
CHAPTER XII


THE CORRELATION RATIO— CONTINGENCY 


1. We have seen that we can reasonably use the coefficient
of correlation when regression is linear, that is when the
means of the columns (and the means of the rows) are
approximately in a straight line, but in other circumstances
its use is open to objection. In the present chapter other
methods are described which are not open to the same
objection. We shall deal first with a function known as the
“correlation ratio” ($\eta$), which is a useful measure in some
cases.

The value of $\eta_{yx}$ is given by

$$\eta_{yx}^2 = \frac{S\{n_x(\bar{y}_x - \bar{y})^2\}}{N\sigma_y^2} \qquad (1)$$

where $\bar{y}_x$ is the mean of the y's corresponding to the particular
array x, $n_x$ is the number of cases in the array x, N the total
frequency, $\sigma_y$ the standard deviation of the y's and $\bar{y}$ is the
mean of all the y's. The summation extends over all the arrays.
In a similar way we can work from the y-arrays and have

$$\eta_{xy}^2 = \frac{S\{n_y(\bar{x}_y - \bar{x})^2\}}{N\sigma_x^2} \qquad (2)$$

These values of $\eta$ will not be the same except in the limiting
case when regression in both directions is linear and then
$\eta_{yx} = \eta_{xy} = r$. It will be seen that the correlation ratio $\eta_{yx}$ can
alternatively be expressed as the ratio of the standard deviation
of the means of the y-arrays, each array being weighted
with the number in it, to the standard deviation of the y's.
Taking the example on p. 142 we should find $\eta$ as follows:
Mean unexpired    Deduct mean of
term in each      whole (20.312)
column ȳ_x        ȳ_x - ȳ           (ȳ_x - ȳ)²      n_x      n_x(ȳ_x - ȳ)²

  10.333            -9.979            99.6              6            598
  13.250            -7.062            49.9              4            200
  13.176            -7.136            50.9             17            865
  16.113            -4.199            17.6             62          1,091
  17.230            -3.082             9.50           584          5,548
  20.141             -.171              .029          643             19
  21.877            +1.565             2.45         1,098          2,690
  21.665            +1.353             1.83           388            710
  21.500            +1.188             1.41            60             85
  27.625            +7.313            53.5              8            428

                                                    2,870         12,234

$$\eta_{yx}^2 = \frac{12234}{2870 \times (7.6067)^2} = .07367$$

or $\eta_{yx}$ = .2714

The figure 7.6067 is the value of $\sigma_2$ on p. 149, multiplied by
5, the unit of grouping.

Working similarly with the maturity ages, we obtain
$\eta_{xy}^2$ = .06610, or $\eta_{xy}$ = .2571.

The arithmetical processes described in §§ 11-13 of Ch. VII
supply us with most of the figures required.
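The calculation of formula (1) for this example can be sketched as follows. The column means, frequencies, overall mean and standard deviation are those quoted in the table above; the variable names are illustrative only.

```python
from math import sqrt

# Mean unexpired term in each column and the number of cases in it.
column_means = [10.333, 13.250, 13.176, 16.113, 17.230,
                20.141, 21.877, 21.665, 21.500, 27.625]
column_counts = [6, 4, 17, 62, 584, 643, 1098, 388, 60, 8]

overall_mean = 20.312   # mean of the whole, as quoted
sigma_y = 7.6067        # standard deviation in years, as quoted

total = sum(column_counts)
# Numerator of formula (1): S{ n_x (mean of array - overall mean)^2 }.
between = sum(n * (m - overall_mean) ** 2
              for n, m in zip(column_counts, column_means))
eta_squared = between / (total * sigma_y ** 2)
eta = sqrt(eta_squared)

# The weighted mean of the column means should recover the overall mean.
weighted_mean = sum(n * m for n, m in zip(column_counts, column_means)) / total
print(total, round(between), round(eta_squared, 4), round(eta, 4))
```

Working from unrounded squares gives a numerator within a unit of the tabulated 12,234, so the value of η² agrees with the text to four decimal places.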

2. We may now go back to formula (1) and rearrange the
denominator. Remembering that the square of the standard
deviation can be found by squaring the difference between each
observation, $o_{xy}$, and the mean (see Chapter III, §14), we have

$$N\sigma_y^2 = \underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y})^2$$

$$= \underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y}_x)^2 + \underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\} + 2\,\underset{x}{S}\{(\bar{y}_x - \bar{y})\,\underset{y}{S}(o_{xy} - \bar{y}_x)\}$$

$$= \underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y}_x)^2 + \underset{x}{S}\,n_x(\bar{y}_x - \bar{y})^2$$

as the final expression in the previous line vanishes.
Consequently

$$\eta_{yx}^2 = \frac{\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\}}{\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\} + \underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y}_x)^2} \qquad (3)$$

$\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\}$ measures the amount of variation between
arrays, while $\underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y}_x)^2$ measures the amount of variation
within the arrays. Neither part of the denominator can be
negative, therefore

$$1 \geq \eta_{yx} \geq 0$$

It also follows from (3) that for $\eta^2$ to be large $\underset{x}{S}\{n_x(\bar{y}_x - \bar{y})^2\}$
must be large as compared with $\underset{x}{S}\,\underset{y}{S}(o_{xy} - \bar{y}_x)^2$; in other
words, the larger $\eta^2$ becomes, the greater the variation in the
means of the arrays compared to the variations within the
arrays. Also the smaller $\eta^2$ becomes, the less important are
the differences in the means of the arrays.

3. The correlation ratio may be used for three main
purposes:

(a) to measure the relationship between x and y; this has
already been shown in the example,

(b) to test whether there is any real difference in the
array means, $\bar{y}_x$, other than what might be expected from
sampling,

(c) to test whether it is reasonable to regard the regression
line as a straight line.



In dealing with (b) and (c) we must suppose that the
distribution of y for each x array is not far from a “normal”
distribution and that the standard deviations of y arrays for
given x are approximately equal. Under these conditions it
may be shown that, for the test mentioned in (b) above, if
the array means, say k in number, in the population are all
equal, so that the population value of $\eta_{yx}^2$ is zero, then in a
sample of N pairs of values of x and y,

(i) the expected value of $\eta^2$, say $\bar{\eta}^2$,

$$= (k - 1)/(N - 1) \qquad (4)$$

(ii) the standard error

$$\sigma_{\eta^2} = \frac{1}{N - 1}\sqrt{2(k - 1)(N - k)/(N + 1)} \qquad (5)$$

Unless, therefore, the observed $\eta^2$ is larger than, say, $\bar{\eta}^2 + 2\sigma_{\eta^2}$,
we cannot feel confident that it is significant, or that the means
of the arrays in the population differ. The distribution of $\eta^2$
is however very skew if the number of arrays is small, so that a
deviation of twice the standard error has to be viewed as
indicated in Chapter X, §19.

Under the same conditions we can show that, for (c) above,
if in the population the means of the arrays ($\bar{y}_x$) lie on a
straight line, i.e. regression is linear, and $\eta^2 - r^2 = 0$, then in
a sample of N pairs of values of x and y the ratio

$$(\eta^2 - r^2)/(1 - r^2)$$

will have

(i) an expected value of

$$(k - 2)/(N - 2) \qquad (6)$$

(ii) a standard error of

$$\frac{1}{N - 2}\sqrt{2(k - 2)(N - k)/N} \qquad (7)$$

and we can then judge of the departure from linearity of
regression in the sample by applying a similar test to that in (b).

4. In the same numerical example (see the first table in §1),
k = 10, N = 2870 and the values from formulae (4) and (5)
are

$$\bar{\eta}^2 = .0031, \qquad \sigma_{\eta^2} = .0015$$

We have actually $\eta^2$ = .0737, so that there is a real difference
in the array means.

If we take the second table and use the test of formulae (6)
and (7), we find $\eta^2$ = .06610, so near to $r^2$ = .06474 that the
ratio $(\eta^2 - r^2)/(1 - r^2)$ is .0014. The expected value is .0028
and the standard error .0014. This table shows linearity. The
first table would hardly have done so.
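Both tests can be run through in a few lines. The sketch assumes that formula (5) reads $\frac{1}{N-1}\sqrt{2(k-1)(N-k)/(N+1)}$ and formula (7) reads $\frac{1}{N-2}\sqrt{2(k-2)(N-k)/N}$; the function names are illustrative.

```python
from math import sqrt

def equal_means_null(k, n):
    """Formulae (4) and (5): expected eta^2 and its standard error."""
    expected = (k - 1) / (n - 1)
    std_error = sqrt(2 * (k - 1) * (n - k) / (n + 1)) / (n - 1)
    return expected, std_error

def linearity_null(k, n):
    """Formulae (6) and (7): expected (eta^2 - r^2)/(1 - r^2) and its s.e."""
    expected = (k - 2) / (n - 2)
    std_error = sqrt(2 * (k - 2) * (n - k) / n) / (n - 2)
    return expected, std_error

k, n = 10, 2870
r_squared = 0.06474

exp4, se5 = equal_means_null(k, n)
exp6, se7 = linearity_null(k, n)

# Second table (maturity ages): eta^2 = .06610.
ratio_second = (0.06610 - r_squared) / (1 - r_squared)
# First table (unexpired terms): eta^2 = .07367.
ratio_first = (0.07367 - r_squared) / (1 - r_squared)

print(round(exp4, 4), round(se5, 4), round(exp6, 4), round(se7, 4))
print(round(ratio_second, 4), round(ratio_first, 4))
```

The ratio for the second table falls below its expected value, whereas that for the first table exceeds the expected value by several times the standard error, which is why only the second can be said to show linearity.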

5. We may now turn to the theory of contingency, which gives us another way of approaching correlation and can be used when the regression is not linear or when the facts are given in a non-quantitative form with a greater number of divisions than those of the fourfold tables discussed in Chapter IX. The principle underlying the theory of contingency is that a comparison is made between the given table and a corresponding table having the same marginal totals but with no correlation. The first step, therefore, is to see how to make a table without correlation, and a little consideration will show that all we have to do is to split up the total of any column in proportion to the distribution of entries in the final total column. Thus, the first column would be

Unexpired Term . . . . . . . . .     2               7           . . .
Frequency with no correlation   6 × 56/2870    6 × 172/2870     . . .

and the remaining part of the table would be formed in a similar way. Now as each column is formed in proportion to the total, the mean of each column must be the same as the mean of the total, which shows at once from the definition that no correlation can exist in such a table.
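
The rule for forming a table without correlation can be sketched as follows. This is an illustration only: the function name `no_correlation_table` is hypothetical, and the marginal totals are those of the endowment assurance table as read from the scan (the third column total, 17, is an inference from the other totals).

```python
def no_correlation_table(row_totals, col_totals):
    """Split each column total in proportion to the row totals, giving the
    table with the same margins but no correlation."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add to the same grand total")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Margins assumed from the table: unexpired terms 2, 7, ..., 47 (rows)
# and ages at maturity 30, 35, ..., 75 (columns).
row_totals = [56, 172, 432, 665, 674, 538, 247, 77, 8, 1]
col_totals = [6, 4, 17, 62, 584, 643, 1098, 388, 60, 8]

table = no_correlation_table(row_totals, col_totals)
first_column = [row[0] for row in table]
print([round(x, 2) for x in first_column[:2]])  # 6 x 56/2870 and 6 x 172/2870
```

Because every column is a fixed multiple of the total column, the row and column sums of the result reproduce the original margins exactly.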

6. The following table shows, for each cell, the figure exhibiting no correlation (upper line) and the figure actually occurring (lower line, in brackets; a dash marks an empty cell).

Central unexpired term of Endowment Assurances (rows) by central age at maturity (columns)

Term      30      35      40      45      50      55      60      65      70      75   Total
  2      0.1     0.1     0.3     1.2    11.4    12.5    21.4     7.6     1.2     0.2      56
         (2)     (-)     (-)     (2)    (26)     (6)    (14)     (6)     (-)     (-)
  7      0.4     0.2     1.0     3.7    35.0    38.6    65.8    23.2     3.6     0.5     172
         (1)     (1)     (2)     (6)    (62)    (36)    (40)    (22)     (2)     (-)
 12      0.9     0.6     2.6     9.3    87.8    96.8   165.4    58.4     9.0     1.2     432
         (-)     (2)     (9)    (17)   (117)    (99)   (127)    (52)     (8)     (1)
 17      1.4     0.9     3.9    14.4   135.3   149.0   254.4    89.9    13.9     1.9     665
         (3)     (-)     (6)    (24)   (145)   (155)   (237)    (84)    (11)     (-)
 22      1.4     0.9     4.0    14.6   137.2   151.0   257.8    91.1    14.1     1.9     674
         (-)     (1)     (-)     (3)   (133)   (167)   (271)    (78)    (20)     (1)
 27      1.1     0.8     3.2    11.6   109.5   120.6   205.9    72.7    11.2     1.4     538
         (-)     (-)     (-)     (9)    (90)   (123)   (231)    (71)    (11)     (3)
 32      0.5     0.4     1.5     5.3    50.3    55.3    94.4    33.4     5.2     0.7     247
         (-)     (-)     (-)     (1)    (11)    (49)   (127)    (49)     (8)     (2)
 37      0.2     0.1     0.5     1.7    15.7    17.2    29.4    10.4     1.6     0.2      77
         (-)     (-)     (-)     (-)     (-)     (6)    (49)    (22)     (-)     (-)
 42      0.0     0.0     0.0     0.2     1.6     1.8     3.1     1.1     0.2     0.0       8
         (-)     (-)     (-)     (-)     (-)     (2)     (2)     (3)     (-)     (1)
 47      0.0     0.0     0.0     0.0     0.2     0.2     0.4     0.2     0.0     0.0       1
         (-)     (-)     (-)     (-)     (-)     (-)     (-)     (1)     (-)     (-)
Total      6       4      17      62     584     643   1,098     388      60       8   2,870

Now, if these two sets of figures coincide exactly in any particular case, there is clearly no correlation in the table; if they differ slightly there is a slight amount, and if they differ greatly there is a considerable amount of correlation. We come therefore to the conclusion that an alternative method of finding the correlation between two things is by measuring the difference between the figures in the actual correlation table and those that would have arisen if there had not been any correlation. In Chapter XI we discussed a method of measuring the goodness of fit (or amount of agreement) between two sets of figures, and this suggests that we might calculate χ² by squaring the difference between each pair of figures in the table, dividing the result by the frequency when there is no correlation, and adding the quotients. The reason for choosing the figure from the table with no correlation as the divisor is that it always has a value, while the correlation table may give a frequency of zero.
7. As it is clear that χ² will give a measure of the association, it will be interesting to see the connection between it and the coefficient of correlation r, and the following proof shows that if the correlation table can be approximately represented by the normal correlation surface, then, where the number of groupings is large,

\[ r = \sqrt{\frac{\phi^2}{1+\phi^2}}, \qquad \text{where } \phi^2 = \chi^2/N. \qquad (8) \]

Using the same notation as that of Chapter VIII, the frequency with no correlation is given by

\[ Z' = \frac{N}{2\pi\sigma_1\sigma_2}\, \exp\left\{-\frac{1}{2}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2}\right)\right\} \]

while that with correlation is

\[ Z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2}\, \exp\left\{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right)\right\} \]

Then

\[ \phi^2 = \frac{1}{N}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \frac{(Z-Z')^2}{Z'}\,dx\,dy = \frac{1}{N}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \left(\frac{Z^2}{Z'} - 2Z + Z'\right)dx\,dy \]

\[ = \frac{1}{2\pi(1-r^2)}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \exp\left\{-\frac{1}{2}\left(\frac{1+r^2}{1-r^2}(x'^2+y'^2) - \frac{4r}{1-r^2}\,x'y'\right)\right\}dx'\,dy' - 1 \]

where x' = x/σ₁ and y' = y/σ₂,

\[ = \frac{1}{1-r^2}\cdot\frac{1}{\sqrt{\left(\dfrac{1+r^2}{1-r^2}\right)^2 - \dfrac{4r^2}{(1-r^2)^2}}} - 1 \qquad \text{by (vi) of Appendix IV} \]

\[ = \frac{1}{1-r^2} - 1, \]

so that φ² + 1 = 1/(1 − r²), or

\[ r = \pm\sqrt{\frac{\phi^2}{1+\phi^2}}. \]

8. The result just obtained may be considered a little more closely.

(1) It shows that r must lie between −1 and +1.

(2) As the value of φ² will not be affected by the order of the columns (or rows), it is permissible to interchange them, provided, of course, the whole column (or row) be moved at once.

(3) The proof shows that r will not necessarily be obtained exactly if a very small number of groups is used, because by using the integral calculus an infinite number of groups was assumed.

(4) We also assumed, however, that we were dealing with smooth series, but as χ² is a measure of the goodness of fit between the correlation and no-correlation figures, a large number of groups gives undue prominence to the chance deviations due to the use of a random sample, and the value of r found from that of φ² may differ considerably from the value reached by the product-moment method. Too fine a grouping may give a less accurate result than a less fine one.

9. These conclusions are borne out by practical work, and any student who cares to go into the subject can find the value of r by the two methods from a large table, using various groupings, and he will see that the best agreements are obtained when the grouping is neither very fine nor very rough. But this general remark indicates a difficulty, for the student will naturally wonder how he is to group his figures in order to reduce them to a suitable number of classes. If he is dealing with facts distributed according to age, he can take groups of ten years instead of the finer grouping of five or three years, or he may lump together the small groups at the ends. He will find that equal frequencies give better results than equal ranges when the material is divided into six (or fewer) classes, but when there are more than six classes equal ranges should be taken. This rule can only be applied broadly: we cannot from the nature of the data make exactly equal groups of our frequencies but must be content with something approaching equality. In order to indicate how we may proceed and how the numerical work is done, the following table has been prepared from that of p. 215.


Central                      Central ages at maturity
unexpired      50 and under       55             60         65 and over     Total
term
2, 7, 12       154.6 (247)    147.9 (141)    252.6 (181)    104.9 (91)        660
17             155.9 (178)    149.0 (155)    254.4 (237)    105.7 (95)        665
22             158.1 (137)    151.0 (167)    257.8 (271)    107.1 (99)        674
27             126.2 (99)     120.6 (123)    205.9 (231)     85.3 (85)        538
32 and over     78.2 (12)      74.5 (57)     127.3 (178)     53.0 (86)        333
Total              673            643           1,098           456          2,870

(The figures in brackets are the actual frequencies; those outside the brackets are the frequencies with no correlation.)


10. The totals are not all equal to one another (the 1,098 cases maturing at age 60 prevent this), but they are far more nearly equal than the totals in the original table. We now work out χ² and find that its value is 198.8.* Hence

\[ \phi^2 = \frac{\chi^2}{N} = \frac{198.8}{2870} = 0.0693 \]

and the coefficient of contingency is 0.254. This differs from the figures given for η in § 1,† and both may differ from the r found by the method of Chapter VII because the original table does not follow sufficiently closely the mathematical form assumed. There is, however, a general difficulty apart from any peculiarity of an individual case, for we can never reach a coefficient of unity because, with a finite number of groups, φ² can never become infinite, which is necessary if r is to be unity. Similarly there is a tendency to mis-state the value of r by the method of contingency when r has other values, and this depends to some extent on the grouping of the material. Adjustments which are of a fairly simple nature should be made.

* To make this more easy to follow we may mention that the contributions to χ² from the first column are 55.1, 3.1, 2.8, 5.8 and 56.0.

† It happens to agree with r from Chapter VII. In the particular case the errors from broad grouping and from deviations from the assumed form happen to balance. The agreement is an illustration of the danger of generalising from isolated cases.
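
The calculation of χ², φ² and the coefficient of contingency for the grouped table can be sketched as follows. This is an illustration, not the author's worksheet: the function name `contingency_r` is hypothetical, and the no-correlation frequencies are recomputed from the exact margins rather than taken from the rounded printed figures, so χ² comes out close to, but not exactly at, the 198.8 of the text.

```python
import math


def contingency_r(observed):
    """Return (chi^2, phi^2, r) for a two-way table of actual frequencies,
    with r = sqrt(phi^2 / (1 + phi^2)) as in formula (8)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, actual in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # no-correlation figure
            chi_sq += (actual - expected) ** 2 / expected
    phi_sq = chi_sq / n
    return chi_sq, phi_sq, math.sqrt(phi_sq / (1.0 + phi_sq))


# Actual frequencies of the grouped table (terms 2-12, 17, 22, 27, 32 and over;
# ages 50 and under, 55, 60, 65 and over).
grouped = [
    [247, 141, 181, 91],
    [178, 155, 237, 95],
    [137, 167, 271, 99],
    [99, 123, 231, 85],
    [12, 57, 178, 86],
]
chi_sq, phi_sq, r = contingency_r(grouped)
print(f"chi^2 = {chi_sq:.1f}, phi^2 = {phi_sq:.4f}, r = {r:.3f}")
```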

11. In § 7, when we worked out the connection between r and φ², we assumed that the frequencies took the form of the normal correlation surface. This means that we assumed that the totals of the columns and rows are “normal curves of error.” Let us suppose that we have no finer grouping than that given in the table in § 9; then the totals of the columns are 673, 643, 1,098 and 456, making a total of 2,870, or, reducing them to a total frequency of unity, we have 0.2345, 0.2240, 0.3826 and 0.1589. From tables of the “normal curve”* we can work out the ordinates at the end of each group of frequency and form the following table.


                    Group       (1) for unit    Total area        Ordinate at     Difference      Col. (5)    (6)/(2)
                    frequency   frequency, n    from beginning    beginning of    of z's          squared
                                                (by adding (2))   group, z        (negatively)
                    (1)         (2)             (3)               (4)             (5)             (6)         (7)

From the columns      673       0.2345          0.2345            0.00000         -0.30694        0.09421     0.402
                      643       0.2240          0.4585            0.30694         -0.08984        0.00807     0.036
                    1,098       0.3826          0.8411            0.39678         +0.15456        0.02389     0.062
                      456       0.1589          1.0000            0.24222         +0.24222        0.05867     0.369
                                                                  0.00000
                    2,870       1.0000                                                                        0.869
                                                                                        Square root = 0.932

From the rows         660       0.2300          0.2300            0.00000         -0.30365        0.09220     0.401
                      665       0.2317          0.4617            0.30365         -0.09345        0.00873     0.038
                      674       0.2348          0.6965            0.39710         +0.04759        0.00226     0.010
                      538       0.1875          0.8840            0.34951         +0.15421        0.02378     0.127
                      333       0.1160          1.0000            0.19530         +0.19530        0.03814     0.329
                                                                  0.00000
                    2,870       1.0000                                                                        0.905
                                                                                        Square root = 0.951

* Tables for Statisticians, Part II, Table II.

The final figures are the square roots of

\[ \frac{(z_0 - z_1)^2}{n_1} + \frac{(z_1 - z_2)^2}{n_2} + \frac{(z_2 - z_3)^2}{n_3} + \text{etc.} \]

where z₀, z₁, z₂, etc. are the ordinates at the beginning of successive groups and n₁, n₂, n₃, etc. are the proportionate frequencies, i.e. the successive terms in col. (2) of the preceding table. The corrected value is

\[ \frac{0.254}{0.932 \times 0.951} = 0.286. \]
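
The adjustment factor can also be obtained without printed tables. The sketch below is an illustration only: the function name `grouping_factor` is hypothetical, and Python's `statistics.NormalDist` stands in for the tables of the normal curve, so the last decimal may differ slightly from the hand-worked figures.

```python
import math
from statistics import NormalDist


def grouping_factor(totals):
    """Square root of the sum of (z_s - z_{s+1})^2 / n_{s+1}, where the z's are
    unit-normal ordinates at the group boundaries and the n's are the
    proportionate frequencies of the groups."""
    std_normal = NormalDist()
    n = sum(totals)
    cumulative = 0
    z_prev = 0.0  # ordinate at the start of the first group (boundary at -infinity)
    total = 0.0
    for t in totals:
        cumulative += t
        if cumulative < n:
            boundary = std_normal.inv_cdf(cumulative / n)
            z_next = std_normal.pdf(boundary)
        else:
            z_next = 0.0  # ordinate at the end of the last group (+infinity)
        total += (z_prev - z_next) ** 2 / (t / n)
        z_prev = z_next
    return math.sqrt(total)


column_factor = grouping_factor([673, 643, 1098, 456])
row_factor = grouping_factor([660, 665, 674, 538, 333])
adjusted = 0.254 / (column_factor * row_factor)
print(f"columns {column_factor:.3f}, rows {row_factor:.3f}, adjusted r {adjusted:.3f}")
```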


12. The first three columns of the table in the preceding paragraph are easily constructed; the third is wanted because tables of the areas of the “normal curve” give those areas from any point up to the end of the curve, i.e. the integral from x to ∞. The next column gives the ordinate, which can be found from the tables in Tables for Statisticians, Part I, where the ordinates and areas are in parallel columns, or directly from the tables in Part II.

We will now turn to the theoretical side and may consider what is the mean of each of the areas n₁, n₂, etc., say, of n_{s+1}. It will be

\[ \int_{x_s}^{x_{s+1}} x\,e^{-\frac{1}{2}x^2}\,dx \Big/ \int_{x_s}^{x_{s+1}} e^{-\frac{1}{2}x^2}\,dx = \left(e^{-\frac{1}{2}x_s^2} - e^{-\frac{1}{2}x_{s+1}^2}\right) \Big/ \int_{x_s}^{x_{s+1}} e^{-\frac{1}{2}x^2}\,dx = (z_s - z_{s+1})/n_{s+1} \]

Hence this expression gives us the distance of the mean value of the area from the mean of the whole distribution. But n_{s+1} is the frequency and therefore

\[ n_{s+1}\left(\frac{z_s - z_{s+1}}{n_{s+1}}\right)^2 = \frac{(z_s - z_{s+1})^2}{n_{s+1}} \]

when summed for all values of s gives the second moment of the distribution, and the adjustment, being the square root of a second moment, is a standard deviation.

We had assumed the standard deviations to be unity: we have now recalculated them on the facts available and adjusted the result. This adjustment in effect removes to a large extent the objections indicated in § 10.

13. It is a little difficult to judge the necessity or success of an adjustment in a case of this kind unless we know the value of the correlation which we ought to reach, and it will probably be more convincing to take one of the coin-tossing tables and, having grouped it, see what values of r are found by the contingency method without adjustment and how near we get to the true value of r with adjustments. For this purpose the table where five coins were common to the pairs of tossings was used and a table was formed as follows.


No. of heads in              No. of heads in first tossing
second tossing        0-4               5-6               7-10           Total
0-3               2,123 (3,906)     2,541 (1,606)       968 (120)        5,632
4                 2,534 (3,360)     3,031 (2,880)     1,155 (480)        6,720
5                 3,038 (2,906)     3,640 (4,032)     1,386 (1,126)      8,064
6                 2,534 (1,580)     3,031 (3,560)     1,155 (1,580)      6,720
7-10              2,123 (600)       2,541 (2,706)       968 (2,326)      5,632
Total               12,352            14,784             5,632          32,768

The actual figures are in brackets; the zero-contingency figures stand outside the brackets.

φ² = 6971/32768 = 0.2127
r (unadjusted) = 0.418

Then working the adjustment as before, 0.953 was found for the rows and 0.892 for the columns, so that the adjusted value of r is 0.418/(0.953 × 0.892) = 0.492.

Another trial may be made with the same coin-tossing table, throwing it into the form

            0-4       5       6-10      Total
0-4        7,266    2,906    2,180     12,352
5          2,906    2,252    2,906      8,064
6-10       2,180    2,906    7,266     12,352
Total     12,352    8,064   12,352     32,768

Here φ² = 0.171 and r (unadjusted) = 0.382; the factor for rows is 0.872 and for columns is the same, so that the adjusted value for r is 0.503.

Now both those should be 0.5, but clearly 0.492 and 0.503 are good approximations with broad groupings, and the examples show both the importance of the adjustment and the accuracy attainable.

14. There is, however, one more aspect of this kind of adjustment to which reference may be made. We remarked (§ 8) that the method of contingency implied that we could change the order of the columns and rows, but if we do this, what will happen to the adjustments? The point is of some importance. In broad groups where the division is not quantitative, we may not be sure that if we could express the scale quantitatively it would give a distribution of anything like the assumed normal curve. Let us put this to the test by taking the grouped figures from one of the tables in the preceding paragraph and rearranging them arbitrarily.

Thus we might produce

Second                   First characteristic
characteristic       a₀         a₁         a₂        Total
b₀                 4,032      1,126      2,906       8,064
b₁                 3,560      1,580      1,580       6,720
b₂                 2,880        480      3,360       6,720
b₃                 2,706      2,326        600       5,632
b₄                 1,606        120      3,906       5,632
Total             14,784      5,632     12,352      32,768

Clearly φ² and the unadjusted r remain unchanged and so we are only concerned with the totals and the procedure of § 11. Working with these we reach 0.944 and 0.856 as the factors by which we adjust* and so find a value of 0.517 for r. This is better than the unadjusted value; in fact, quite good. The explanation is that however we divide up the numbers we get adjustments which will not vary to an extreme extent unless the grouping is exceptional.

* The underlying theory of the adjustment is that a normal frequency surface could be cut up to give the table. This could not, I think, be done in the particular rearrangement. But the adjustments work well.

15. We have already seen that double entry tables will show small values for a measure of correlation even when there is really no correlation, and that it is generally more important to decide whether the apparent correlation is significant than to measure exactly the standard error of its coefficient. All we need to do in considering a standard error for φ² is therefore to compare the actual table with a table formed assuming no correlation and see if the divergence is significant. In practical work it is advisable to make this test before working out the coefficient. Taking, for instance, the table in § 9 (p. 218), χ² = 198.8 and we need to find a value for the probability of a divergence as great as or greater than that indicated, on the assumption that there is no correlation and that the particular table has arisen merely in sampling.

There are 20 cells in the table, but as we fix the totals of each row and each column this would be too large a number to use for n'. The correct number of free cells is (h − 1)(k − 1), where h is the number of rows and k the number of columns. In the particular case (h − 1)(k − 1) = 12. In Tables for Statisticians, Part I, Table XII, n' is used as one more than the number of free cells, i.e. n' = n + 1, and we must therefore enter that table with n' = 13 and χ² = 198.8. Knowing the value of r from our previous calculations, it is not surprising to find that the chance of such a divergence from zero correlation is zero to at least six decimal places.

In a fourfold table there is only one free cell.

Another way of setting out the method described in this section is to say that

(i) the mean χ² = (h − 1)(k − 1) . . . . (9)

(ii) σ_{φ²} = √{2(h − 1)(k − 1)}/N, where φ² = χ²/N . . . . (10)

when there is no correlation.

Formulae (9) and (10) set out the result in a form similar to that already given for η² in formulae (4) and (5).
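
The significance test of this section can be sketched without recourse to printed tables. The code below is an illustration only, with hypothetical function names; it uses the closed form of the χ² tail probability that holds when the number of free cells is even, which covers the present case of 12 but is not a general substitute for the tables.

```python
import math


def free_cells(rows, cols):
    """Number of free cells, (h - 1)(k - 1), once all marginal totals are fixed."""
    return (rows - 1) * (cols - 1)


def chi_sq_tail_even(chi_sq, dof):
    """P(chi^2 >= chi_sq) for an even number of degrees of freedom:
    exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!"""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive even number of free cells")
    half = chi_sq / 2.0
    term = 1.0   # the j = 0 term
    total = 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


n_free = free_cells(5, 4)                   # 5 rows and 4 columns in the grouped table
probability = chi_sq_tail_even(198.8, n_free)
print(n_free, probability < 1e-6)
```

With χ² = 198.8 on 12 free cells the probability is far smaller than one in a million, in agreement with the statement that it is zero to at least six decimal places.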

16. In working at contingency we have up to the present assumed that we calculate the squares of the differences between the actual figures in the table and the corresponding figures when there is no correlation, but we may proceed by adding together the differences regardless of sign. We then obtain the mean of these by dividing by the total number of cases. A diagram in Tables for Statisticians, Part I, gives values of r. The mathematical work leading to this method is more difficult than that for the mean square contingency given above, and in practical work the latter is more dependable.

17. There is yet another method of estimating correlation that may be of help. It is known as correlation of ranks and was suggested by Spearman.* By this method we estimate the correlation between, say, the height and span of a number of schoolchildren without making an exact measurement for any child. We first stand the children in order of height and number them in the rank, the shortest being numbered 1 and the tallest n. Then we rearrange the children in order of span and again number them, the child with the shortest span being numbered 1 and the child with the longest span n. The child numbered 1 in the height rank might be 3 in the span rank, and so on.

The next step is to calculate the sum of the squares of the differences between the ranks, say S(d²), for the two characters (one element in our height and span example would be (3 − 1)² or 4), and we can write

\[ R = 1 - \frac{6S(d^2)}{n(n^2-1)} \qquad (11) \]

\[ r = 2\sin\frac{\pi R}{6} \qquad (12) \]

where R is the coefficient of correlation between ranks and r the corresponding coefficient between variates, similar to that discussed in previous chapters. The relationship depends on the assumption of the normal correlation surface.

The standard error of r found by this method is approximately 5 per cent greater than that found by the product-moment method of Chapter X.

* C. Spearman, Amer. J. Psychol. xv, 72; K. Pearson, Drapers' Company Memoir, No. 4.
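
Formulae (11) and (12) may be illustrated by a short sketch. The function names and the heights and spans are hypothetical, invented only to show the arithmetic, and the sketch assumes that no two children tie on either measurement.

```python
import math


def ranks(values):
    """Rank 1 for the smallest value up to n for the largest (no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_correlation(xs, ys):
    """Return (R, r): the rank coefficient of formula (11) and the
    corresponding variate coefficient of formula (12)."""
    n = len(xs)
    s_d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    big_r = 1.0 - 6.0 * s_d2 / (n * (n * n - 1))
    return big_r, 2.0 * math.sin(math.pi * big_r / 6.0)


# Hypothetical heights and spans (cm) of five children.
heights = [150, 152, 155, 160, 163]
spans = [151, 149, 156, 158, 165]
big_r, r = rank_correlation(heights, spans)
print(f"R = {big_r:.3f}, r = {r:.3f}")
```

Only the order of the children enters the calculation, which is why no exact measurement is needed in practice.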





CHAPTER XIII 


PARTIAL CORRELATION 

1. We have up to the present assumed that we can always deal with pairs of related things, but in many investigations, especially perhaps in social statistics, the problems are complicated by a greater number of variables. Suppose, for instance, we were making a study of infant deaths and trying to ascertain the causes chiefly responsible for a high death-rate; we might examine the home environment of children in a particular district to see whether there was any relation between infant deaths and the habits of the mother. But the health of the mother may be important also, and if we find correlation coefficients in respect of (1) infant deaths and habits of mother and (2) infant deaths and health of mother, we have up to the present found no way of eliminating the possible relation between health and habits of the mother. In other words, if the cause of infant mortality is connected with the habits of the mother, is it merely so connected because health and habits are connected?

2. Let us proceed as we did in dealing with correlation where there are only two variables and assume that x₁ y₁ z₁, x₂ y₂ z₂, etc. be associated deviations, and let

\[ z = a + bx + cy \qquad (1) \]

As before, we can omit a if we measure every variable from its mean. Then using the method of moments we have

\[ (bx_1 + cy_1) + (bx_2 + cy_2) + \dots = z_1 + z_2 + \dots \]

\[ (bx_1 + cy_1)x_1 + (bx_2 + cy_2)x_2 + \dots = x_1 z_1 + x_2 z_2 + \dots \]

or

\[ bS(x^2) + cS(xy) = S(xz) \qquad (2) \]

Similarly

\[ bS(xy) + cS(y^2) = S(yz) \qquad (3) \]

Now, slightly altering the notation used on p. 145, we write

\[ S(xy) = N\sigma_x\sigma_y r_{xy}, \qquad S(x^2) = N\sigma_x^2, \qquad S(y^2) = N\sigma_y^2, \]

hence

\[ S(xz) = N\sigma_x\sigma_z r_{xz} \qquad \text{and} \qquad S(yz) = N\sigma_y\sigma_z r_{yz}. \]

Substituting in (2) and (3), we have

\[ b\sigma_x^2 + c\sigma_x\sigma_y r_{xy} = \sigma_x\sigma_z r_{xz} \]

or

\[ b\sigma_x + c\sigma_y r_{xy} = \sigma_z r_{xz} \]

and

\[ b\sigma_x r_{xy} + c\sigma_y = \sigma_z r_{yz} \]

hence

\[ b = \frac{\sigma_z}{\sigma_x}\cdot\frac{r_{xz} - r_{yz}r_{xy}}{1 - r_{xy}^2} \qquad \text{and} \qquad c = \frac{\sigma_z}{\sigma_y}\cdot\frac{r_{yz} - r_{xz}r_{xy}}{1 - r_{xy}^2} \qquad (4) \]

Substituting in (1) and remembering that a = 0, we have

\[ z = \frac{\sigma_z}{\sigma_x}\cdot\frac{r_{xz} - r_{yz}r_{xy}}{1 - r_{xy}^2}\,x + \frac{\sigma_z}{\sigma_y}\cdot\frac{r_{yz} - r_{xz}r_{xy}}{1 - r_{xy}^2}\,y \]

Now when dealing with the two variables we expressed the result

\[ y = r\frac{\sigma_2}{\sigma_1}x, \qquad x = r\frac{\sigma_1}{\sigma_2}y \]

so that r, the measure of the correlation, is the geometric mean of the coefficients

\[ r\frac{\sigma_2}{\sigma_1} \qquad \text{and} \qquad r\frac{\sigma_1}{\sigma_2}. \]

Similarly with three variables we can write down x in terms of z and y, or y in terms of x and z, and, again, using the geometric mean of the appropriate pairs of coefficients, we have

\[ {}_{y}r_{xz} = \frac{r_{xz} - r_{xy}r_{yz}}{\sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}} \]

as the net (or partial) coefficient between x and z associated with a single type of y. The square root in the denominator is to be taken as positive.

3. Now coefficients of correlation must not exceed unity, therefore

\[ (r_{xz} - r_{xy}r_{yz})^2 \le (1 - r_{xy}^2)(1 - r_{yz}^2) \]

or, r_xz must lie between the limits

\[ r_{xy}r_{yz} \pm \sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)} \]

From this we can write down some of the limits that may arise when we are dealing with three variables.

If r_xy =    and r_yz =    Then r_xz =
   0             0         any value
   1             1         1
  -1            -1         1
   1            -1         -1
   0            ±1         0
   0            ±r         between ±√(1 − r²)
   r             r         between 1 and 2r² − 1
  -r            -r         between 1 and 2r² − 1
   r            -r         between 1 − 2r² and −1
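
The limits within which the third coefficient must lie follow directly from the condition that the partial coefficient cannot exceed unity. A minimal sketch (the function name `r_xz_limits` is hypothetical):

```python
import math


def r_xz_limits(r_xy, r_yz):
    """Lower and upper limits of r_xz implied by the other two coefficients:
    r_xy * r_yz -/+ sqrt((1 - r_xy^2)(1 - r_yz^2))."""
    centre = r_xy * r_yz
    # max(0, ...) guards against a tiny negative value from rounding.
    half_width = math.sqrt(max(0.0, (1.0 - r_xy ** 2) * (1.0 - r_yz ** 2)))
    return centre - half_width, centre + half_width


for pair in [(0.0, 0.0), (1.0, 1.0), (0.6, 0.6), (0.6, -0.6)]:
    low, high = r_xz_limits(*pair)
    print(pair, round(low, 2), round(high, 2))
```

For r_xy = r_yz = 0.6 the limits are 2r² − 1 = −0.28 and 1, in agreement with the general entries of the table.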


4. We may now consider the following numerical example.*

* “Relative value of factors influencing infant welfare”, Annals of Eugenics, I, 178-9. The statistics quoted are from Bradford, 1911. The student will find many similar sets of tables in this paper.

Habits of              Health of Mother (y)
Mother (x)           Good       Not good      Total
Good                  956          197        1,153
Indifferent           257          286          543
Total               1,213          483        1,696

r_xy = 0.567 ± 0.033

Child dead             Habits of Mother (x)
or not (z)           Good      Indifferent    Total
Living                997          420        1,417
Dead                  156          123          279
Total               1,153          543        1,696

r_xz = 0.213 ± 0.046

Child dead             Health of Mother (y)
or not (z)           Good       Not good      Total
Living              1,065          352        1,417
Dead                  148          131          279
Total               1,213          483        1,696

r_yz = 0.329 ± 0.045

Let us now work out our partial coefficient between “Habits of Mother” and “Infantile Deaths” for constant “Health of Mother”, and we have

\[ {}_{y}r_{xz} = \frac{0.213 - (0.329)(0.567)}{\sqrt{0.891}\,\sqrt{0.678}} = 0.034 \]

In other words the value, though it looked like 0.213 at first, now proves to be only 0.034, and as the standard error is about 0.04 we could not say that the result is significant.

If we worked out the partial coefficient between “Health of Mother” and “Infant Deaths”, keeping “Habits” constant, we reach a value of 0.26, which is significant though smaller than the crude figure of 0.329.
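
Both partial coefficients of this example can be reproduced from the three crude coefficients. The following is a sketch only; the function name `partial_correlation` and its argument names are hypothetical.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """Net (partial) coefficient between a and b for constant c:
    (r_ab - r_ac * r_bc) / sqrt((1 - r_ac^2)(1 - r_bc^2))."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))


r_xy, r_xz, r_yz = 0.567, 0.213, 0.329   # habits-health, habits-deaths, health-deaths

habits_deaths = partial_correlation(r_xz, r_xy, r_yz)   # health held constant
health_deaths = partial_correlation(r_yz, r_xy, r_xz)   # habits held constant
print(f"{habits_deaths:.3f} {health_deaths:.2f}")
```

The first value falls below its standard error of about 0.04, while the second remains several times its standard error, which is the contrast drawn in the text.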

5. It is possible to extend the theory to a larger number of variables, but it seems unnecessary to do so here. The example will give an indication of the use to which such work may be put, and supplies a warning against accepting the numerical value of a coefficient until other causes that may affect the result have been considered.
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APPENDIX I 


CORRECTIONS FOR MOMENTS

1. The following method has been suggested by E. Pairman and K. Pearson (Biometrika, XII, 231 et seq.) when the curve rises abruptly at one or both ends.

Let n₁, n₂, etc. be the proportionate frequencies in the 1st, 2nd, etc. groups, then put

\[ a_1 = -\tfrac{1}{60}\{137n_1 - 163n_2 + 137n_3 - 63n_4 + 12n_5\} \]
\[ a_2 = \tfrac{1}{12}\{45n_1 - 109n_2 + 105n_3 - 51n_4 + 10n_5\} \]
\[ a_3 = -\tfrac{1}{4}\{17n_1 - 54n_2 + 64n_3 - 34n_4 + 7n_5\} \]
\[ a_4 = \{3n_1 - 11n_2 + 15n_3 - 9n_4 + 2n_5\} \]
\[ a_5 = -\{n_1 - 4n_2 + 6n_3 - 4n_4 + n_5\} \]

Similarly, values of b₁, b₂, etc. can be obtained from the other end of the distribution.

Then the values of the moments are as follows, where A is 
the distance of the start and B is the distance of the end of 
the distribution from the origin about which moments are 
calculated 

/^] L ” ^1 {t^(% "^0^3 "b 2 § 2 , 0 ®$) “b Y §-(&1 d" 2520 ^ 5 )} 

” ^2""Tkd"{ i'20" (^2 ~ 12 6 ^4) (% — "qq&s + 2520^5)} 

/^3 “ ^3 i ^1 (^1 

- £qA (a 2 - jYe %) + i A 2 (% - + ^ 0 %)} 

= ^4 ~ i + Mo + {lie ( a 2 - 81) ^ 4 ) - K - + sio %) 

— A-A 2 (% — 3^0-%) +iA 3 (%— ^% + 2520 a5 )} 

and similar expressions in B and the b's.

If the moments be taken about the start of the first group, so that the first group is multiplied by powers of ½, the second by powers of 3/2 and so on, this expression is simplified so far as the a terms are concerned because the terms involving A vanish.

2. The method of reaching these adjustments starts with the Euler-Maclaurin expansion and assumes that the curve takes the form

l + + etc 

at the beginning and a similar form at the end. This leads to the values of the a's.

The differential coefficients at each end required in the Euler-Maclaurin expansion are then evolved and the result given is reached.

The frequency at the start is approximately

\[ \frac{N}{60}\{137n_1 - 163n_2 + 137n_3 - 63n_4 + 12n_5\} \]

By means of this expression we can discover how nearly the frequency curve comes to zero at the ends of the range.

3. A few numerical examples may be given. The rule that the area, in the case of high contact, can be found by adding ordinates, when tested by adding 12 ordinates of the normal curve calculated to 5 decimal places, gave 1.24998 instead of 1.25000. Nine ordinates of a Type III curve with high contact gave 24473 instead of 24475.

An example of the method of § 1 above is taken from the paper there cited. Moments for a curve with ordinate proportional to x^{1/2} from x = 0 to x = 10 were calculated, the exact result being known. The proportional frequencies, which may be taken as the data, were:

n₁ = 0.031623        n₆ = 0.111205
n₂ = 0.057820        n₇ = 0.120904
n₃ = 0.074874        n₈ = 0.129880
n₄ = 0.088665        n₉ = 0.138273
n₅ = 0.100571        n₁₀ = 0.146185
1-000000 
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From these figures 

a₁ = −0.0131,0643        b₁ = 0.1499,9857
a₂ = −0.0444,8167        b₂ = −0.0074,9283
a₃ = 0.0258,4150         b₃ = −0.0003,9450
a₄ = −0.0148,8400        b₄ = −0.0000,2600
a₅ = 0.0045,0200         b₅ = −0.0000,3800
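
The values of the a's can be checked from the first five proportional frequencies. The sketch below is an illustration only: the function name `abruptness_coefficients` is hypothetical, and the multipliers 1/60, 1/12 and 1/4 are assumed from the formulae of § 1 as far as they can be read; they are consistent with the worked figures to the eighth decimal place.

```python
def abruptness_coefficients(n):
    """Return (a1, ..., a5) from the first five proportional frequencies."""
    n1, n2, n3, n4, n5 = n[:5]
    a1 = -(137 * n1 - 163 * n2 + 137 * n3 - 63 * n4 + 12 * n5) / 60
    a2 = (45 * n1 - 109 * n2 + 105 * n3 - 51 * n4 + 10 * n5) / 12
    a3 = -(17 * n1 - 54 * n2 + 64 * n3 - 34 * n4 + 7 * n5) / 4
    a4 = 3 * n1 - 11 * n2 + 15 * n3 - 9 * n4 + 2 * n5
    a5 = -(n1 - 4 * n2 + 6 * n3 - 4 * n4 + n5)
    return a1, a2, a3, a4, a5


# First five proportional frequencies of the worked example.
frequencies = [0.031623, 0.057820, 0.074874, 0.088665, 0.100571]
a1, a2, a3, a4, a5 = abruptness_coefficients(frequencies)
for name, value in zip(["a1", "a2", "a3", "a4", "a5"], (a1, a2, a3, a4, a5)):
    print(f"{name} = {value:.8f}")

# Combination used in the first-moment adjustment when A = 0.
first_adjustment = (a1 - a3 / 60 + a5 / 2520) / 12
print(f"{first_adjustment:.8f}")
```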

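\[ a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5 = -0.0135,3533 \]
\[ a_2 - \tfrac{5}{126}a_4 = -0.0438,9104 \quad \text{and so on.} \]

Putting A = 0 and B = 10 and calculating moments about the start, we require for the a adjustments

\[ \tfrac{1}{12}\left(a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5\right) = -0.0011,2794 \]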

and the other adjustments m order are 

- 0003,6576, - 0003,7846 and --0003,4269 
For the b terms we have 

Ato-AA + iAlA) = -0121,0043 
^^3 + 2520 ^ 5 ) = '2500,0860 
and the other terms m order give 

3 7501,2825, 50-0017,1000 

- 0000,6244, --0001,8731 

-•0374,6365, -0037,5074 

•1500,2972, --0000,5945 

Finally for the adjusted moments we reach

Moment        Raw moments      With Sheppard's      With full        True value
                               adjustment           adjustment
μ′₁ (ν′₁)        5.9880            5.9880              5.9994           6.0000
μ′₂ (ν′₂)       42.6900           42.6067             42.8570          42.8571
μ′₃ (ν′₃)      331.0854          329.5884            333.3349         333.3333
μ′₄ (ν′₄)     2698.7735         2677.4576           2727.2757        2727.2727


4. The method described above gives good results but is laborious. The approximations are less satisfactory in those cases where the first group does not relate to a complete unit base and the curve rises abruptly. The same authors gave a method for J-shaped distributions, but I should not use it, as a simple approximation can be found by examining the exponential (Type X).

When statistics expressible by the exponential y = y₀e^{−x/σ} are stated in groups for each equal subrange h of x, the successive groups are

\[ \int_0^h y_0 e^{-x/\sigma}\,dx, \qquad \int_h^{2h} y_0 e^{-x/\sigma}\,dx, \quad \text{etc.,} \]

or

\[ y_0\sigma(1 - e^{-h/\sigma}), \qquad y_0\sigma(1 - e^{-h/\sigma})e^{-h/\sigma}, \qquad y_0\sigma(1 - e^{-h/\sigma})e^{-2h/\sigma}, \quad \text{etc.} \]

These terms may also be regarded as a geometrical progression,* the first term being y₀σ(1 − e^{−h/σ}) and the common ratio e^{−h/σ}. It follows that if we treat the areas as a geometrical progression extending to infinity, calculate the moments on this assumption and read the result as graduated terms of a geometrical progression, we shall reach correctly graduated areas, and we can subsequently write down the equation to the curve with little trouble.

Other points are however involved. Let us write the geometrical progression as ka^x and put A = (1 − a)⁻¹; then the moments about its mean are

2nd moment    A² − A
3rd moment    2A³ − 3A² + A
4th moment    9A⁴ − 18A³ + 10A² − A

and if we work out β₁ and β₂, we get 4 + h²/μ₂ and 9 + h²/μ₂ respectively.

Using the exponential, the moments, etc. about the mean are: μ₂ = σ², μ₃ = 2σ³, μ₄ = 9σ⁴, β₁ = 4, β₂ = 9.

Hence when we calculate moments, assuming that the statistics form a geometrical progression, whereas they are really areas from a curve, and seek to choose the type of curve from Pearson's criteria in his system, we shall reach a persistent error. For this purpose the β₁ and β₂ found from the statistics should be reduced by h²/μ₂.

* “Geometrical progression” is used throughout to describe a discrete series and exponential curve to describe a continuous one.

This rule can be used as an approximation in all J-shaped curves and will be found to give satisfactory results.

So far we have assumed that we know the start of the curve and that all the bases of the areas are of equal size. If this does not apply we can, in the case of an exponential curve, fit the curve, excluding the first (incomplete) term, and regard that term as related to an appropriate base extrapolated from the graduation of the remainder. This is an arbitrary arrangement but has practical advantages.

In other J-shaped curves in similar circumstances the first step would be to assume an exponential, to find therefrom approximately the base of the first incomplete group, and then assume that the area is concentrated at the middle point. This will generally give good results: the assumption of the exponential overstates the base and the assumption of half-way assumes a less rapidly falling curve than the J-shaped forms of Types I and III. There is therefore a balance of error.

Turning to the statistical side, the example on p. 108 gives μ₂ = 2.045, β₁ = 4.629, β₂ = 9.502. These figures come from the unadjusted moments, and deducting 0.49 from the above values for β₁ and β₂ we reach 4.14 and 9.01. The theoretical values when an exponential curve is to be used are 4 and 9. If we apply the rule as an approximation in other J-shaped cases we find that in the example on p. 112, where a twisted J-shaped curve is given, μ₂ = 4.266, β₁ = 0.761, β₂ = 2.646, and the adjustment leads to β₁ = 0.527 and β₂ = 2.412. Hence 5β₂ − 6β₁ − 9 becomes −0.098 instead of −0.368. The theoretical criterion would lead us to expect 5β₂ − 6β₁ − 9 = 0.

These examples are not, of course, complete evidence, but they show that the suggestion may lead to accurate results, and it has the merit of simplicity. The rule with regard to the adjustment of the β's by h²/μ₂ may be combined with the approximations given on p. 109, where it is mentioned that the mean is overstated, when σ is positive, by about h²/(12σ), and the second moment about the true mean (i.e. the mean as corrected by h²/(12σ)) is understated by about h²/12. We do not know σ exactly but can use the square root of the second moment as found from the calculations. If h be taken as a unit and the moments found in terms of h, i.e. in working units, the corrections are 1/(12√μ₂) and 1/12.
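
The statement that a geometrical progression gives β₁ = 4 + h²/μ₂ and β₂ = 9 + h²/μ₂ can be verified numerically. The sketch below is an illustration only, with the hypothetical function name `geometric_betas`; it works in units of h (so that h = 1) and truncates the infinite series after enough terms for the remainder to be negligible.

```python
def geometric_betas(a, terms=2000):
    """Return (mu2, beta1, beta2) for the progression (1 - a) * a**x,
    x = 0, 1, 2, ..., computed by direct summation in units of h."""
    weights = [(1.0 - a) * a ** x for x in range(terms)]
    mean = sum(x * w for x, w in enumerate(weights))

    def central(power):
        # Moment of the given order about the mean.
        return sum((x - mean) ** power * w for x, w in enumerate(weights))

    mu2, mu3, mu4 = central(2), central(3), central(4)
    return mu2, mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


a = 0.6
big_a = 1.0 / (1.0 - a)
mu2, beta1, beta2 = geometric_betas(a)
print(f"mu2 = {mu2:.4f} (A^2 - A = {big_a ** 2 - big_a:.4f})")
print(f"beta1 - 4 = {beta1 - 4:.4f}, beta2 - 9 = {beta2 - 9:.4f}, 1/mu2 = {1 / mu2:.4f}")
```

Both excesses equal 1/μ₂, which is the amount by which the β's found from the statistics are to be reduced.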

5. An alternative to the method of § 1 is to find mid-ordinates corresponding to the areas of the groups and treat these mid-ordinates in the manner explained in Chapter III, § 18. The mid-ordinates m₁, m₂, etc. are found by the following equations:

\[ m_3 = \tfrac{1}{1920}\{2134n_3 - 116(n_2 + n_4) + 9(n_1 + n_5)\} \]
\[ m_2 = \tfrac{1}{1920}\{-71n_1 + 2044n_2 - 26n_3 - 36n_4 + 9n_5\} \]
\[ m_1 = \tfrac{1}{1920}\{1689n_1 + 684n_2 - 746n_3 + 364n_4 - 71n_5\} \]

The total frequency is not exactly reproduced but the moments obtained are good approximations.
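
The mid-ordinate formulae can be tested on a curve whose ordinates are known. The sketch below is an illustration only: the function name `mid_ordinates` is hypothetical, and the sign of the 71n₁ term in m₂ and the coefficient 1689 in m₁ are assumptions made so that each set of coefficients adds to 1920; with these values the formulae reproduce the mid-ordinates of any curve of the fourth degree exactly.

```python
def mid_ordinates(n1, n2, n3, n4, n5):
    """Mid-ordinates of the first three groups from five group areas (unit bases)."""
    m1 = (1689 * n1 + 684 * n2 - 746 * n3 + 364 * n4 - 71 * n5) / 1920
    m2 = (-71 * n1 + 2044 * n2 - 26 * n3 - 36 * n4 + 9 * n5) / 1920
    m3 = (2134 * n3 - 116 * (n2 + n4) + 9 * (n1 + n5)) / 1920
    return m1, m2, m3


# Areas of y = x^4 over the unit groups 0-1, 1-2, ..., 4-5.
areas = [((k + 1) ** 5 - k ** 5) / 5 for k in range(5)]
m1, m2, m3 = mid_ordinates(*areas)
print(m1, m2, m3)   # compare with 0.5^4, 1.5^4 and 2.5^4
```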

6. It has been pointed out that one of the difficulties in calculating moments when the curve rises abruptly at one or both ends arises because the true start or end of the curve is unknown. In other words, the base of the first area or last area (or both) is smaller than that of the other areas. In practice good results can often be obtained with unadjusted moments, but the first attempt may require modification by varying the range of the curve (see p. 124). When this is done, the moment contribution for the first area, or the last, or both, must be recalculated by assuming that the area is concentrated at the middle of the smaller base.

E. S. Martin has approached the problem more systematically in a paper in Biometrika, XXIV, 12, and has given tables from which the start of the curve may be estimated.
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